

Title: MLIMC: Machine learning-based implicit-solvent Monte Carlo
Monte Carlo (MC) methods are important computational tools for molecular structure optimization and prediction. When solvent effects are treated explicitly, MC methods become very expensive because of the large number of degrees of freedom associated with the water molecules and mobile ions. Alternatively, implicit-solvent MC methods can greatly reduce the computational cost by applying a mean-field approximation to the solvent while retaining the atomic detail of the target molecule. The two most popular implicit-solvent models are the Poisson-Boltzmann (PB) model and the Generalized Born (GB) model; the GB model is an approximation to the PB model but is much faster to evaluate. In this work, we develop a machine learning-based implicit-solvent Monte Carlo (MLIMC) method that combines the accuracy of the PB model with the efficiency of the GB model. Specifically, the MLIMC method uses a fast and accurate PB-based machine learning (PBML) scheme to compute the electrostatic solvation free energy at each step. We validate the MLIMC method on a benzene-water system and a protein-water system, and show that it offers significant advantages in speed and accuracy for molecular structure optimization and prediction.
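To make the core loop concrete, the sketch below shows how an implicit-solvent Metropolis MC step can call a learned solvation-energy surrogate at every move. It is a minimal illustration under our own assumptions, not the paper's implementation: `pbml_solvation_energy`, `mm_energy`, and `propose_move` are hypothetical placeholders for the PBML model, the gas-phase force-field energy, and the conformational move.

```python
import numpy as np

# Hypothetical placeholders: a PBML surrogate returning the electrostatic
# solvation free energy of a conformation, and a gas-phase force-field term.
def pbml_solvation_energy(coords):
    return -50.0 + 0.01 * float(np.sum(coords ** 2))   # toy stand-in

def mm_energy(coords):
    return 0.5 * float(np.sum(coords ** 2))            # toy stand-in

def propose_move(coords, rng, step=0.05):
    return coords + rng.normal(scale=step, size=coords.shape)

def mlimc(coords, n_steps=1000, kT=0.593, seed=0):
    """Metropolis MC in which each step's total energy adds the ML-predicted
    implicit-solvent term, so no explicit water or ions are simulated."""
    rng = np.random.default_rng(seed)
    energy = mm_energy(coords) + pbml_solvation_energy(coords)
    for _ in range(n_steps):
        trial = propose_move(coords, rng)
        e_trial = mm_energy(trial) + pbml_solvation_energy(trial)
        # Metropolis acceptance on the solute + solvation energy
        # (kT ~ 0.593 kcal/mol at 298 K)
        if e_trial <= energy or rng.random() < np.exp(-(e_trial - energy) / kT):
            coords, energy = trial, e_trial
    return coords, energy

coords, energy = mlimc(np.random.default_rng(1).normal(size=(12, 3)))
```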
Award ID(s):
1819193 2110922
NSF-PAR ID:
10373863
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Chinese Journal of Chemical Physics
Volume:
34
Issue:
6
ISSN:
1674-0068
Page Range / eLocation ID:
683 to 694
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like This
  1.
    We develop a hybrid approach that combines the Monte Carlo (MC) method, a variational implicit-solvent model (VISM), and a binary level-set method for the simulation of biomolecular binding in an aqueous solvent. The solvation free energy for the biomolecular complex is estimated by minimizing the VISM free-energy functional over all possible solute-solvent interfaces, which serve as dielectric boundaries. This functional consists of the solute volumetric, solute-solvent interfacial, solute-solvent van der Waals interaction, and electrostatic free energies. A technique of shifting the dielectric boundary is used to accurately predict the electrostatic part of the solvation free energy. Minimizing such a functional in each MC move is made possible by our new and fast binary level-set method. This method is based on approximating the surface area by the convolution of an indicator function with a compactly supported kernel, and is implemented by simple flips of numerical grid cells locally around the solute-solvent interface. We apply our approach to the p53-MDM2 system, in which the two molecules are approximated by rigid bodies. Our efficient approach captures some of the poses before the final bound state. All-atom molecular dynamics simulations started from most of these poses quickly reach the final bound state. Our work is a new step toward realistic simulations of biomolecular interactions. With further improvement of coarse graining and MC sampling, and combined with other models, our hybrid approach can be used to study the free-energy landscape and kinetic pathways of ligand binding to proteins.
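    As a rough illustration of the convolution-based area estimate described above, the toy sketch below smooths a binary solute indicator with a compactly supported (here, uniform) kernel and counts the grid cells that straddle the interface. The paper's actual kernel, calibration constant, and local cell-flip updates are not reproduced; this is only a crude sketch of the idea.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def interface_area(phi, h=1.0, width=3):
    """Crude area estimate: convolve the binary indicator phi (1 = solute,
    0 = solvent) with a compactly supported kernel; cells whose smoothed
    value lies strictly between 0 and 1 form a band around the interface."""
    smoothed = uniform_filter(phi.astype(float), size=width)
    band = (smoothed > 0.0) & (smoothed < 1.0)
    # The band is roughly `width` cell-layers thick, each layer holding about
    # area / h**2 cells, so dividing the count by `width` estimates the area.
    return band.sum() * h ** 2 / width

# Example: a discretized sphere of radius 10 on a 64^3 grid.
n, r = 64, 10.0
x = np.arange(n) - n / 2
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
phi = (X ** 2 + Y ** 2 + Z ** 2 <= r ** 2).astype(float)
print(interface_area(phi))   # compare with 4*pi*r**2 ~ 1256.6
```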
  2. Abstract Purpose

    Most commercially available treatment planning systems (TPSs) approximate the continuous delivery of volumetric modulated arc therapy (VMAT) plans with a series of discretized static beams, which can make VMAT dose computation extremely inefficient. In this study, we developed a polar-coordinate-based pencil beam (PB) algorithm for efficient VMAT dose computation with high-resolution gantry-angle sampling, which improves computational efficiency and reduces the dose discrepancy due to the angular under-sampling effect.

    Methods and Materials

    6 MV pencil beams were simulated on a uniform cylindrical phantom in an EGSnrc Monte Carlo (MC) environment. The MC-generated PB kernels were collected in the polar coordinate system for each bixel on a fluence map and subsequently fitted with a series of Gaussians. The fluence was calculated using a detector's-eye view with off-axis and MLC transmission factors corrected. Doses for a VMAT arc on the phantom were computed by summing, for each bixel, the convolution of the corresponding PB kernel with the fluence in the polar coordinate system. The convolution was performed using the fast Fourier transform to expedite computation. The calculated doses were converted to the Cartesian coordinate system and compared with a reference dose computed by the collapsed cone convolution (CCC) algorithm of the TPS. A heterogeneous phantom was created to study heterogeneity corrections with the proposed algorithm. Ten VMAT arcs were included to evaluate the algorithm's performance. Gamma analysis and computational complexity analysis were used to measure dosimetric accuracy and computational efficiency, respectively.
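    The per-bixel convolution step lends itself to a compact FFT formulation. The sketch below is our own schematic of that idea, not the authors' code: the array shapes and the circular-convolution treatment are simplifying assumptions.

```python
import numpy as np

def fft_convolve2d(kernel, fluence):
    """Circular 2-D convolution via FFT; both arrays live on the same
    (polar) grid. Zero-padding would be needed to avoid wrap-around."""
    return np.real(np.fft.ifft2(np.fft.fft2(kernel) * np.fft.fft2(fluence)))

def arc_dose(kernels, fluences):
    """Sum the kernel-fluence convolutions over all bixels of a VMAT arc.
    kernels, fluences: equal-length lists of 2-D arrays, one pair per bixel."""
    dose = np.zeros_like(kernels[0], dtype=float)
    for kernel, fluence in zip(kernels, fluences):
        dose += fft_convolve2d(kernel, fluence)
    return dose
```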

    Results

    The dosimetric comparisons on the homogeneous phantom between the proposed PB algorithm and the CCC algorithm for 10 VMAT arcs demonstrate that the proposed algorithm achieves dosimetric accuracy comparable to that of the CCC algorithm, with average gamma passing rates of 96% (2%/2 mm) and 98% (3%/3 mm). In addition, the proposed algorithm provides better computational efficiency for VMAT dose computation on a PC equipped with a 4-core processor than the CCC algorithm running on a dual 10-core server. Moreover, computational complexity analysis reveals that the proposed algorithm has a substantial efficiency advantage for VMAT dose computation in a homogeneous medium, especially when a fine angular sampling rate is applied. This supports reducing dose errors from the angular under-sampling effect by using a finer angular sampling rate while still preserving a practical computing speed. For dose calculation on the heterogeneous phantom, the proposed algorithm with heterogeneity corrections still offers reasonable dosimetric accuracy with computational efficiency comparable to that of the CCC algorithm.

    Conclusions

    We proposed a novel polar-coordinate-based pencil beam algorithm for VMAT dose computation that achieves better computational efficiency while maintaining clinically acceptable dosimetric accuracy and reducing the dose error caused by the angular under-sampling effect. It also provides a flexible VMAT dose computation structure that allows adjustable sampling rates and direct dose computation in regions of interest, making the algorithm potentially useful for clinical applications such as independent dose verification for VMAT patient-specific QA.

     
  3. Abstract

    Machine learning (ML) has been applied to space weather problems with increasing frequency in recent years, driven by an influx of in-situ measurements and a desire to improve modeling and forecasting capabilities throughout the field. Space weather originates from solar perturbations and comprises the complex variations those perturbations cause within the numerous systems between the Sun and Earth. These systems are often tightly coupled and not well understood, which creates a need for skillful models that also report the confidence of their predictions. One such dynamical system highly impacted by space weather is the thermosphere, the neutral region of Earth's upper atmosphere. Our inability to forecast it has severe repercussions for satellite drag and for computing the probability of collision between two space objects in low Earth orbit (LEO), both central to decision making in space operations. Even with an (assumed) perfect forecast of model drivers, our incomplete knowledge of the system often yields inaccurate thermospheric neutral mass density predictions. Continuing efforts are being made to improve model accuracy, but density models rarely provide estimates of confidence in their predictions. In this work, we propose two techniques for developing nonlinear ML regression models that predict thermospheric density while providing robust and reliable uncertainty estimates: Monte Carlo (MC) dropout and direct prediction of the probability distribution, both using the negative logarithm of predictive density (NLPD) loss function. We show the performance of models trained on both local and global datasets. The NLPD loss provides similar results for both techniques, but the direct probability distribution prediction method has a much lower computational cost. For the global model regressed on the Space Environment Technologies High Accuracy Satellite Drag Model (HASDM) density database, we achieve errors of approximately 11% on independent test data with well-calibrated uncertainty estimates. Using an in-situ CHAllenging Minisatellite Payload (CHAMP) density dataset, models developed with both techniques give test errors on the order of 13%. On validation and test data, the CHAMP models are within 2% of perfect calibration for the twenty prediction intervals tested. We also show that this model can be used to obtain global density predictions with uncertainties at a given epoch.
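    For concreteness, a minimal sketch of the two ingredients named above follows: a Gaussian NLPD loss, in which a model predicts a mean and log-variance and is penalized by the negative log of the predicted density at the observation, and an MC-dropout style predictor that aggregates repeated stochastic forward passes. Here `stochastic_forward` is a hypothetical model call with dropout left active at inference.

```python
import numpy as np

def nlpd_loss(y, mu, log_var):
    """Negative log predictive density for a Gaussian predictive
    distribution parameterized by mean mu and log-variance log_var."""
    return float(np.mean(0.5 * (np.log(2.0 * np.pi) + log_var
                                + (y - mu) ** 2 / np.exp(log_var))))

def mc_dropout_predict(stochastic_forward, x, T=100):
    """Aggregate T stochastic forward passes (dropout active at inference)
    into a predictive mean and variance for input x."""
    samples = np.stack([stochastic_forward(x) for _ in range(T)])
    return samples.mean(axis=0), samples.var(axis=0)
```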

     
  4. Abstract

    This review spotlights the role of atomic-level modeling in research on metal-organic frameworks (MOFs), especially the key methodologies of density functional theory (DFT), Monte Carlo (MC) simulations, and molecular dynamics (MD) simulations. The discussion covers how periodic and cluster-based DFT calculations can provide novel insights into MOF properties, particularly for predicting structural transformations, understanding thermodynamic properties and catalysis, and supplying information, such as force-field parameters or partial charges, that feeds into classical simulations. Classical simulation methods are then described, highlighting force-field selection, databases of MOFs for high-throughput screening, and the synergistic use of MC and MD simulations. By predicting equilibrium thermodynamic and dynamic properties, these methods offer a broad perspective on MOF behavior and mechanisms. Additionally, the incorporation of machine learning (ML) techniques into quantum and classical simulations is discussed; these methods can enhance accuracy, expedite simulation setup, reduce computational costs, predict key parameters, optimize geometries, and estimate MOF stability. By charting the growth and promise of computational research in the MOF field, the review aims to provide insights and recommendations that facilitate the broader incorporation of computational modeling into MOF research.

     
    Abstract. Monte Carlo (MC) methods have been widely used in uncertainty analysis and parameter identification for hydrological models. The main challenge with these approaches, however, is the prohibitive number of model runs required to acquire an adequate sample size, which may take from days to months, especially when the simulations are run in distributed mode. In the past, emulators have been used to reduce the computational burden of MC simulation through direct estimation of residual-based response surfaces. Here, we apply emulators of an MC simulation to parameter identification for a distributed conceptual hydrological model using two likelihood measures: the absolute bias of model predictions (Score) and a measure based on the time-relaxed limits-of-acceptability concept (pLoA). Three machine-learning models (MLMs) were built using model parameter sets and response surfaces from a limited number of model realizations (4000). The developed MLMs were then applied to predict pLoA and Score for a large set of model parameters (95,000). Behavioural parameter sets were identified with the time-relaxed limits-of-acceptability approach, based on the predicted pLoA values, and used to estimate quantile streamflow predictions weighted by their respective Scores. The three MLMs adequately mimicked the response surfaces directly estimated from the MC simulations, with R² values of 0.7 to 0.92. Similarly, the models identified using the coupled machine-learning (ML) emulators and the limits-of-acceptability approach reproduced the median streamflow prediction very well during the calibration and validation periods, with average Nash-Sutcliffe efficiency values of 0.89 and 0.83, respectively.
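    The emulator workflow described above can be illustrated in a few lines. The sketch below uses a random-forest regressor and a toy response surface purely for illustration; the model class, features, and scores are our assumptions, while the sample sizes (4000 training runs, 95,000 predictions) follow the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# 4000 parameter sets with likelihood scores from full (expensive) MC runs;
# here the response surface is a toy function standing in for the model.
theta_train = rng.uniform(size=(4000, 8))
score_train = np.exp(-np.sum((theta_train - 0.5) ** 2, axis=1))

emulator = RandomForestRegressor(n_estimators=200, random_state=0)
emulator.fit(theta_train, score_train)

# Cheap emulator sweep over 95,000 parameter sets; keep the best-scoring
# sets as candidate "behavioural" parameters for further analysis.
theta_big = rng.uniform(size=(95_000, 8))
score_pred = emulator.predict(theta_big)
behavioural = theta_big[score_pred > np.quantile(score_pred, 0.99)]
```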