Computational simulation of biomolecules can provide important insights into protein design, protein-ligand binding interactions, and ab initio biomolecular folding, among other applications. Accurate treatment of the solvent environment is essential in such applications, but the use of explicit solvents can add considerable cost. Implicit treatment of solvent effects using a dielectric continuum model is an attractive alternative to explicit solvation because it describes solvation effects without including solvent degrees of freedom. Previously, we described the development and parameterization of implicit solvent models for small molecules. Here, we extend the parameterization of the generalized Kirkwood (GK) implicit solvent model for use with biomolecules described by the AMOEBA force field by adding corrections to the calculation of effective radii that account for interstitial spaces that arise within biomolecules. These corrections include element-specific pairwise descreening scale factors, a short-range neck contribution to describe the solvent-excluded space between pairs of nearby atoms, and tanh-based rescaling of the overall descreening integral. We then apply the AMOEBA/GK implicit solvent to a set of ten proteins and achieve an average coordinate root mean square deviation from the experimental structures of 2.0 Å across 500 ns simulations. Overall, the continued development of implicit solvent models will help facilitate the simulation of biomolecules on mechanistically relevant timescales.
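For readers unfamiliar with the coordinate root mean square deviation quoted above, the following is a minimal Python sketch of the metric. It assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the function names and toy coordinates are hypothetical and are not taken from the authors' analysis.

```python
import math


def coordinate_rmsd(coords_a, coords_b):
    """Root mean square deviation between two pre-aligned coordinate sets.

    Each argument is a list of (x, y, z) tuples in angstroms. No
    superposition is performed here (assumption: already aligned).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between matching atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def average_rmsd(per_protein_rmsds):
    """Arithmetic mean of per-protein RMSD values."""
    return sum(per_protein_rmsds) / len(per_protein_rmsds)


# Toy example: every atom displaced by 2 angstroms along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
snapshot = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
rmsd_value = coordinate_rmsd(reference, snapshot)
```

In practice a trajectory frame would first be superimposed on the experimental structure (for example with a least-squares fit) before this calculation; that step is omitted here.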
                            A Physics-Guided Neural Network for Predicting Protein–Ligand Binding Free Energy: From Host–Guest Systems to the PDBbind Database
                        
                    
    
Calculation of protein–ligand binding affinity is a cornerstone of drug discovery. Classic implicit solvent models, which have been widely used to accomplish this task, lack accuracy compared to experimental references. Emerging data-driven models, on the other hand, are often accurate yet not fully interpretable and are prone to overfitting. In this research, we explore the application of Theory-Guided Data Science to the study of protein–ligand binding. A hybrid model is introduced by integrating a Graph Convolutional Network (data-driven model) with the GBNSR6 implicit solvent (physics-based model). The proposed physics-data model is tested on a dataset of 368 complexes from the PDBbind refined set and 72 host–guest systems. Results demonstrate that the proposed Physics-Guided Neural Network improves on the accuracy of the purely data-driven model. In addition, the interpretability and transferability of our model are improved compared to the purely data-driven model. Further analyses include evaluating model robustness and understanding relationships between the physical features.
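Experimental references for binding affinity are often reported as dissociation constants, while models of this kind predict binding free energies. The standard thermodynamic relation between the two is ΔG° = RT ln(K_d / c°). The sketch below illustrates that conversion only; the function name, the default temperature, and the 1 M standard concentration are assumptions made for illustration and are not details from the abstract.

```python
import math

# Gas constant in kcal/(mol*K).
GAS_CONSTANT_KCAL = 1.987204259e-3


def binding_free_energy(kd_molar, temperature_k=298.15, standard_conc_molar=1.0):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Implements dG = R * T * ln(Kd / c0). The 1 M standard state and the
    298.15 K default are illustrative assumptions.
    """
    if kd_molar <= 0.0 or temperature_k <= 0.0:
        raise ValueError("Kd and temperature must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar / standard_conc_molar)


# A nanomolar binder corresponds to roughly -12.3 kcal/mol at room temperature.
dg_nanomolar = binding_free_energy(1e-9)
```

Tighter binding (smaller K_d) gives a more negative free energy, which is why errors of about 1.4 kcal/mol correspond to roughly a tenfold error in K_d at room temperature.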
- Award ID(s): 2136095
- PAR ID: 10393683
- Date Published:
- Journal Name: Biomolecules
- Volume: 12
- Issue: 7
- ISSN: 2218-273X
- Page Range / eLocation ID: 919
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The protein–ligand binding affinity quantifies the binding strength between a protein and its ligand. Computer modeling and simulations can be used to estimate the binding affinity or binding free energy using data- or physics-driven methods or a combination thereof. Here we discuss a purely physics-based sampling approach based on biased molecular dynamics simulations. Our proposed method generalizes and simplifies previously suggested stratification strategies that use umbrella sampling or other enhanced sampling simulations with additional collective-variable-based restraints. The approach presented here uses a flexible scheme that can be easily tailored for any system of interest. We estimate the binding affinity of human fibroblast growth factor 1 to heparin hexasaccharide based on the available crystal structure of the complex as the initial model and four different variations of the proposed method to compare against the experimentally determined binding affinity obtained from isothermal titration calorimetry experiments.
- We propose a free energy calculation method for receptor–ligand systems with multiple binding poses that avoids exhaustive enumeration of the poses. For systems with multiple binding poses, the standard procedure is to enumerate the orientations of the binding poses, restrain the ligand to each orientation, and then calculate the binding free energy for each binding pose. In this study, we modify a part of the thermodynamic cycle in order to sample a broader conformational space of the ligand in the binding site. This modification leads to more accurate free energy calculations without performing separate free energy simulations for each binding pose. As a test, we applied our modification to simple model host–guest systems, which have only two binding poses, using a single decoupling method (SDM) in implicit solvent. The binding free energies obtained from our method without knowledge of the two binding poses were in good agreement with the benchmark results obtained by explicit enumeration of the binding poses. Our method is applicable to other alchemical binding free energy calculation methods such as the double decoupling method (DDM) in explicit solvent. We performed a calculation for a protein–ligand system in explicit solvent, phenol binding to T4 lysozyme, using our modified thermodynamic path. The results of the free energy simulation along our modified path were in good agreement with the results of conventional DDM, which requires a separate binding free energy calculation for each of the binding poses. © 2019 Wiley Periodicals, Inc.
- Monte Carlo (MC) methods are important computational tools for molecular structure optimizations and predictions. When solvent effects are explicitly considered, MC methods become very expensive due to the large number of degrees of freedom associated with the water molecules and mobile ions. Alternatively, implicit-solvent MC can greatly reduce the computational cost by applying a mean-field approximation to solvent effects while maintaining the atomic detail of the target molecule. The two most popular implicit-solvent models are the Poisson-Boltzmann (PB) model and the Generalized Born (GB) model; the GB model is an approximation to the PB model but is much faster in simulation time. In this work, we develop a machine learning-based implicit-solvent Monte Carlo (MLIMC) method that combines the advantages of both implicit solvent models in accuracy and efficiency. Specifically, the MLIMC method uses a fast and accurate PB-based machine learning (PBML) scheme to compute the electrostatic solvation free energy at each step. We validate our MLIMC method using a benzene-water system and a protein-water system. We show that the proposed MLIMC method has great advantages in speed and accuracy for molecular structure optimization and prediction.
- Servo error pre-compensation (SEP) is commonly used to improve the accuracy of feed drives. Existing SEP approaches often involve the use of physics-based linear models (e.g., transfer functions) to predict servo errors, but suffer from inaccuracies due to unmodeled nonlinear dynamics in feed drives. This paper proposes a linear hybrid model for SEP that combines physics-based and data-driven linear models. The proposed model is shown to approximate nonlinearities unmodeled in physics-based linear models. In experiments on a precision feed drive, the proposed hybrid model improves the accuracy of servo error prediction by up to 38% compared to a physics-based model.
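
The implicit-solvent Monte Carlo entry in the list above evaluates an energy at each step and then accepts or rejects a trial move. As a generic illustration of that loop, here is a minimal Metropolis sketch in Python. The one-dimensional coordinate, the `toy_energy` function (a hypothetical stand-in for a solute energy plus a solvation term), and all parameter values are assumptions for illustration; this is not the MLIMC method or its PBML energy model.

```python
import math
import random


def metropolis_accept(delta_energy, kT, rng):
    """Metropolis criterion: always accept downhill moves, otherwise
    accept with probability exp(-delta_energy / kT)."""
    if delta_energy <= 0.0:
        return True
    return rng.random() < math.exp(-delta_energy / kT)


def toy_energy(x):
    """Hypothetical stand-in energy: a 'solute' term plus a 'solvation' term."""
    return (x - 1.0) ** 2 + 0.5 * x * x


def run_monte_carlo(energy_fn, x0, n_steps, step_size, kT, seed=0):
    """Random-walk Metropolis sampling of a single coordinate."""
    rng = random.Random(seed)
    x, energy = x0, energy_fn(x0)
    accepted = 0
    for _ in range(n_steps):
        # Propose a symmetric random displacement.
        trial = x + rng.uniform(-step_size, step_size)
        trial_energy = energy_fn(trial)
        if metropolis_accept(trial_energy - energy, kT, rng):
            x, energy = trial, trial_energy
            accepted += 1
    return x, energy, accepted / n_steps


final_x, final_energy, acceptance = run_monte_carlo(
    toy_energy, x0=5.0, n_steps=2000, step_size=0.5, kT=0.6
)
```

In an implicit-solvent setting, the expensive part is the energy evaluation inside the loop, which is why a faster solvation-energy estimate translates directly into more Monte Carlo steps per unit of compute.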
 An official website of the United States government