Title: MLIMC: Machine learning-based implicit-solvent Monte Carlo
Monte Carlo (MC) methods are important computational tools for molecular structure optimization and prediction. When solvent effects are considered explicitly, MC methods become very expensive because of the large number of degrees of freedom associated with the water molecules and mobile ions. Alternatively, implicit-solvent MC can greatly reduce the computational cost by applying a mean-field approximation to the solvent effects while maintaining the atomic detail of the target molecule. The two most popular implicit-solvent models are the Poisson-Boltzmann (PB) model and the Generalized Born (GB) model; the GB model is an approximation to the PB model but is much faster in simulation time. In this work, we develop a machine learning-based implicit-solvent Monte Carlo (MLIMC) method that combines the accuracy and efficiency advantages of the two implicit-solvent models. Specifically, the MLIMC method uses a fast and accurate PB-based machine learning (PBML) scheme to compute the electrostatic solvation free energy at each step. We validate the MLIMC method on a benzene-water system and a protein-water system. We show that the proposed MLIMC method has great advantages in speed and accuracy for molecular structure optimization and prediction.
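The general idea of evaluating an energy at each MC step can be illustrated with a generic Metropolis loop. This is only a minimal sketch and not the authors' MLIMC implementation: `toy_energy` is a hypothetical stand-in for a trained energy model such as the PBML scheme, and the function names, step size, and `kT` value (arbitrary units) are assumptions chosen for illustration.

```python
import math
import random


def metropolis_mc(x0, energy_fn, n_steps=2000, step_size=0.1, kT=0.6, seed=0):
    """Generic Metropolis Monte Carlo search over a list of coordinates.

    energy_fn is any callable returning an energy for a configuration; in an
    implicit-solvent setting it would include the solvation free energy.
    """
    rng = random.Random(seed)
    x = list(x0)
    e = energy_fn(x)
    best_x, best_e = list(x), e
    accepted = 0
    for _ in range(n_steps):
        # Propose a move: perturb one randomly chosen coordinate.
        trial = list(x)
        i = rng.randrange(len(trial))
        trial[i] += rng.uniform(-step_size, step_size)
        e_trial = energy_fn(trial)
        delta = e_trial - e
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            x, e = trial, e_trial
            accepted += 1
            if e < best_e:
                best_x, best_e = list(x), e
    return best_x, best_e, accepted / n_steps


def toy_energy(x):
    # Hypothetical surrogate: a quadratic well with its minimum at (1, 1, ...).
    # A real application would call a trained model here instead.
    return sum((xi - 1.0) ** 2 for xi in x)


if __name__ == "__main__":
    best_x, best_e, rate = metropolis_mc([3.0, -2.0], toy_energy)
    print("lowest energy found:", best_e, "acceptance rate:", rate)
```

Because the energy function is passed in as a plain callable, a slow but accurate evaluator and a fast learned surrogate can be swapped without changing the sampling loop.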
Award ID(s):
1819193 2110922
PAR ID:
10373863
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Chinese Journal of Chemical Physics
Volume:
34
Issue:
6
ISSN:
1674-0068
Page Range / eLocation ID:
683 to 694
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We develop a hybrid approach that combines the Monte Carlo (MC) method, a variational implicit-solvent model (VISM), and a binary level-set method for the simulation of biomolecular binding in an aqueous solvent. The solvation free energy for the biomolecular complex is estimated by minimizing the VISM free-energy functional over all possible solute-solvent interfaces that are used as dielectric boundaries. This functional consists of the solute volumetric, solute-solvent interfacial, solute-solvent van der Waals interaction, and electrostatic free energies. A technique of shifting the dielectric boundary is used to accurately predict the electrostatic part of the solvation free energy. Minimizing such a functional in each MC move is made possible by our new and fast binary level-set method. This method is based on the approximation of surface area by the convolution of an indicator function with a compactly supported kernel and is implemented by simple flips of numerical grid cells locally around the solute-solvent interface. We apply our approach to the p53-MDM2 system, for which the two molecules are approximated by rigid bodies. Our efficient approach captures some of the poses before the final bound state. All-atom molecular dynamics simulations starting from most of these poses quickly reach the final bound state. Our work is a new step toward realistic simulations of biomolecular interactions. With further improvement of coarse graining and MC sampling, and combined with other models, our hybrid approach can be used to study the free-energy landscape and kinetic pathways of ligand binding to proteins.
  2. Many machine learning problems optimize an objective that must be measured with noise. The primary method is a first order stochastic gradient descent using one or more Monte Carlo (MC) samples at each step. There are settings where ill-conditioning makes second order methods such as limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) more effective. We study the use of randomized quasi-Monte Carlo (RQMC) sampling for such problems. When MC sampling has a root mean squared error (RMSE) of $O(n^{-1/2})$ then RQMC has an RMSE of $o(n^{-1/2})$ that can be close to $O(n^{-3/2})$ in favorable settings. We prove that improved sampling accuracy translates directly to improved optimization. In our empirical investigations for variational Bayes, using RQMC with stochastic quasi-Newton method greatly speeds up the optimization, and sometimes finds a better parameter value than MC does.
  3. This review spotlights the role of atomic‐level modeling in research on metal‐organic frameworks (MOFs), especially the key methodologies of density functional theory (DFT), Monte Carlo (MC) simulations, and molecular dynamics (MD) simulations. The discussion covers how periodic and cluster‐based DFT calculations can provide novel insights into MOF properties, with a focus on predicting structural transformations, understanding thermodynamic properties and catalysis, and providing information or properties that are fed into classical simulations, such as force field parameters or partial charges. Classical simulation methods are then described, highlighting force field selection, databases of MOFs for high‐throughput screening, and the synergistic nature of MC and MD simulations. By predicting equilibrium thermodynamic and dynamic properties, these methods offer a wide perspective on MOF behavior and mechanisms. Additionally, the incorporation of machine learning (ML) techniques into quantum and classical simulations is discussed. These methods can enhance accuracy, expedite simulation setup, reduce computational costs, predict key parameters, optimize geometries, and estimate MOF stability. By charting the growth and promise of computational research in the MOF field, the aim is to provide insights and recommendations to facilitate the incorporation of computational modeling more broadly into MOF research.
  4. Water is the primary cellular solvent, yet it is challenging to simulate computationally. Here we simulate water molecules in the Gramicidin A channel, comparing Monte Carlo (MC) sampling with continuum electrostatics against Molecular Dynamics (MD) calculations with the non-polarizable CHARMM36 and polarizable Drude force fields. These give different water properties: classical MD yields well-oriented water wires, while the Drude force field and continuum electrostatics lead to more disordered water molecules, often changing orientation in the middle of the channel.
  5. Monte Carlo (MC) methods have been widely used in uncertainty analysis and parameter identification for hydrological models. The main challenge with these approaches is, however, the prohibitive number of model runs required to acquire an adequate sample size, which may take from days to months, especially when the simulations are run in distributed mode. In the past, emulators have been used to minimize the computational burden of the MC simulation through direct estimation of the residual-based response surfaces. Here, we apply emulators of an MC simulation in parameter identification for a distributed conceptual hydrological model using two likelihood measures, i.e. the absolute bias of model predictions (Score) and another based on the time-relaxed limits of acceptability concept (pLoA). Three machine-learning models (MLMs) were built using model parameter sets and response surfaces with a limited number of model realizations (4000). The developed MLMs were applied to predict pLoA and Score for a large set of model parameters (95 000). The behavioural parameter sets were identified using a time-relaxed limits of acceptability approach, based on the predicted pLoA values, and applied to estimate the quantile streamflow predictions weighted by their respective Score. The three MLMs were able to adequately mimic the response surfaces directly estimated from MC simulations, with $R^2$ values of 0.7 to 0.92. Similarly, the models identified using the coupled machine-learning (ML) emulators and limits of acceptability approach performed very well in reproducing the median streamflow prediction during the calibration and validation periods, with average Nash–Sutcliffe efficiency values of 0.89 and 0.83, respectively.
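For readers unfamiliar with the Nash–Sutcliffe efficiency reported in item 5, the following is a minimal sketch of the standard definition, NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). The function name and the example streamflow values are hypothetical and are not taken from the cited study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency of simulated values against observations.

    Returns 1.0 for a perfect match and 0.0 when the simulation is only as
    good as predicting the observed mean; negative values are worse than that.
    """
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared model errors.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observations are constant; NSE is undefined")
    return 1.0 - residual / variance


if __name__ == "__main__":
    # Hypothetical streamflow values, for illustration only.
    obs = [1.0, 2.0, 3.0]
    print(nash_sutcliffe(obs, [1.0, 2.0, 3.0]))  # perfect simulation
    print(nash_sutcliffe(obs, [2.0, 2.0, 2.0]))  # mean-only simulation
```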