
Title: Breaking the polar‐nonpolar division in solvation free energy prediction

Implicit solvent models divide solvation free energies into polar and nonpolar additive contributions, whereas polar and nonpolar interactions are inseparable and nonadditive. We present a feature functional theory (FFT) framework to break this ad hoc division. The essential ideas of FFT are as follows: (i) representability assumption: there exists a microscopic feature vector that can uniquely characterize and distinguish one molecule from another; (ii) feature‐function relationship assumption: the macroscopic features of a molecule, including solvation free energy, are a functional of microscopic feature vectors; and (iii) similarity assumption: molecules with similar microscopic features have similar macroscopic properties, such as solvation free energies. Based on these assumptions, solvation free energy prediction is carried out in the following protocol. First, we construct a molecular microscopic feature vector that is efficient in characterizing the solvation process using quantum mechanics and Poisson–Boltzmann theory. Microscopic feature vectors are combined with macroscopic features, that is, physical observables, to form extended feature vectors. Additionally, we partition a solvation dataset into queries according to molecular compositions. Moreover, for each target molecule, we adopt a machine learning algorithm for its nearest neighbor search, based on the selected microscopic feature vectors. Finally, from the extended feature vectors of obtained nearest neighbors, we construct a functional of solvation free energy, which is employed to predict the solvation free energy of the target molecule. The proposed FFT model has been extensively validated via a large dataset of 668 molecules. The leave‐one‐out test gives an optimal root‐mean‐square error (RMSE) of 1.05 kcal/mol. FFT predictions of SAMPL0, SAMPL1, SAMPL2, SAMPL3, and SAMPL4 challenge sets deliver the RMSEs of 0.61, 1.86, 1.64, 0.86, and 1.14 kcal/mol, respectively. 
Using a test set of 94 molecules and its associated training set, the present approach was carefully compared with a classic solvation model based on weighted solvent accessible surface area. © 2017 Wiley Periodicals, Inc.
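
The nearest-neighbor step and the leave-one-out test described above can be illustrated with a minimal sketch. This is not the authors' FFT implementation: it swaps in a plain k-nearest-neighbor search with inverse-distance weighting as a stand-in for the learned functional, and the names `knn_predict` and `leave_one_out_rmse`, the two-component feature vectors, and the energy values are all hypothetical and synthetic, not taken from any solvation dataset.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(target, features, energies, k=3):
    """Inverse-distance-weighted mean energy of the k nearest neighbors.

    Assumption: this simple weighting stands in for the functional that
    the abstract says is built from the neighbors' extended features.
    """
    dists = [euclidean(target, f) for f in features]
    nearest = sorted(range(len(features)), key=lambda i: dists[i])[:k]
    # Small epsilon avoids division by zero for an exact feature match.
    weights = [1.0 / (dists[i] + 1e-9) for i in nearest]
    total = sum(w * energies[i] for w, i in zip(weights, nearest))
    return total / sum(weights)


def leave_one_out_rmse(features, energies, k=3):
    """Predict each molecule from all the others and return the RMSE."""
    squared_error = 0.0
    for i in range(len(features)):
        rest_f = features[:i] + features[i + 1:]
        rest_e = energies[:i] + energies[i + 1:]
        pred = knn_predict(features[i], rest_f, rest_e, k)
        squared_error += (pred - energies[i]) ** 2
    return math.sqrt(squared_error / len(features))


# Synthetic toy data (hypothetical): energy = -2*x0 + 0.5*x1.
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
energies = [-2.0 * x0 + 0.5 * x1 for x0, x1 in features]

if __name__ == "__main__":
    print("LOO RMSE (toy data):", leave_one_out_rmse(features, energies, k=2))
```

In this sketch the similarity assumption is encoded by the distance metric alone; a real feature vector would need scaling and feature selection before Euclidean distance becomes meaningful.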

Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Journal of Computational Chemistry
Page Range / eLocation ID:
p. 217-233
Sponsoring Org:
National Science Foundation
More Like this
  1. Hydration free energies of small molecules are commonly used as benchmarks for solvation models. However, errors in predicting hydration free energies are partially due to the force fields used and not just the solvation model. To address this, we have used the 3D reference interaction site model (3D-RISM) of molecular solvation and existing benchmark explicit solvent calculations with a simple element count correction (ECC) to identify problems with the non-bond parameters in the general AMBER force field (GAFF). 3D-RISM was used to calculate hydration free energies of all 642 molecules in the FreeSolv database, and a partial molar volume correction (PMVC), ECC, and their combination (PMVECC) were applied to the results. The PMVECC produced a mean unsigned error of 1.01 ± 0.04 kcal/mol and root mean squared error of 1.44 ± 0.07 kcal/mol, better than the benchmark explicit solvent calculations from FreeSolv, and required less than 15 s of computing time per molecule on a single CPU core. Importantly, parameters for PMVECC showed systematic errors for molecules containing Cl, Br, I, and P. Applying ECC to the explicit solvent hydration free energies found the same systematic errors. The results strongly suggest that some small adjustments to the Lennard–Jones parameters for GAFF will lead to improved hydration free energy calculations for all solvent models. 
  2. Abstract

    Accurate estimation of solvation free energy (SFE) lays the foundation for accurate prediction of binding free energy. The Poisson‐Boltzmann (PB) or generalized Born (GB) combined with surface area (SA) continuum solvation methods (PBSA and GBSA) have been widely used in SFE calculations because they can achieve a good balance between accuracy and efficiency. However, the accuracy of these methods can be affected by several factors, such as the charge models, the polar and nonpolar SFE calculation methods, and the atom radii used in the calculation. In this work, the performance of the ABCG2 (AM1‐BCC‐GAFF2) charge model, as well as two other charge models, that is, RESP (Restrained Electrostatic Potential) and AM1‐BCC (Austin Model 1‐bond charge corrections), on the SFE prediction of 544 small molecules in water by PBSA/GBSA was evaluated. In order to improve the performance of the PBSA prediction based on the ABCG2 charge, we further explored the influence of atom radii on the prediction accuracy and obtained a set of atom radius parameters for more accurate SFE prediction using PBSA based on ABCG2/GAFF2 by reproducing the thermodynamic integration (TI) calculation results. The PB radius parameters of carbon, oxygen, sulfur, phosphorus, chloride, bromide and iodine were adjusted. New atom types, on, oi, hn1, hn2, and hn3, were introduced to further improve the fitting performance. Then, we tuned the parameters in the nonpolar SFE model using the experimental SFE data and the PB calculation results. By adopting the new radius parameters and new nonpolar SFE model, the root mean square error (RMSE) of the SFE calculation for the 544 molecules decreased from 2.38 to 1.05 kcal/mol. Finally, the new radius parameters were applied in the prediction of protein‐ligand binding free energies using the MM‐PBSA method. 
For the eight systems tested, we could observe higher correlation between the experiment data and calculation results and smaller prediction errors for the absolute binding free energies, demonstrating that our new radius parameters can improve the free energy calculation using the MM‐PBSA method.

  3. Abstract

    We demonstrate that the solvation‐layer interface condition (SLIC) continuum dielectric model for molecular electrostatics, combined with a simple solvent‐accessible‐surface‐area (SASA)‐proportional model for nonpolar solvent effects, accurately predicts solvation entropies of neutral and charged small molecules. The SLIC/SASA model has only seven fitting parameters in total and achieves this accuracy using a training set with only 20 compounds. Despite this simplicity, solvation free energies and entropies are nearly as accurate as those predicted by the more sophisticated Langevin dipoles solvation model. Surprisingly, the model automatically reproduces the negligible contribution of electrostatics to the solvation of hydrophobic compounds. Opportunities for improvement include nonpolar solvation, anion solvation entropies, and heat capacities. More molecular realism may be needed for these quantities. To enable a future, explicit‐solvent‐based assessment of the SLIC/SASA implicit‐solvent model, we predict solvation entropies for the Mobley test set, which are available as Supporting Information.

  4. Abstract

    Solvation effects profoundly influence the characteristics and behavior of chemical systems in liquid solutions. The interaction between solute and solvent molecules intricately impacts solubility, reactivity, stability, and various chemical processes. Continuum solvation models gained prominence in quantum chemistry by implicitly capturing these interactions and enabling efficient investigations of diverse chemical systems in solution. In comparison, continuum solvation models in condensed matter simulation are very recent. Among these, the self‐consistent continuum solvation (SCCS) and the soft‐sphere continuum solvation (SSCS) models have been among the first to be successfully parameterized and extended to model periodic systems in aqueous solutions and electrolytes. Like most continuum approaches, these models depend on a number of parameters that are linked to experimental or theoretical properties of the solvent, or that can be tuned based on reference data. Here, we present a systematic parameterization of the SSCS model for over 100 nonaqueous solvents. We validate the model's efficacy across diverse solvent environments by leveraging experimental solvation free energies and partition coefficients from comprehensive databases. The average root mean square error over all the solvents was calculated as 0.85 kcal/mol, which is below chemical accuracy (1 kcal/mol). Similarly to what has been reported by Hille et al. (J. Chem. Phys. 2019, 150, 041710) for the SCCS model, a single‐parameter model accurately reproduces experimental solvation energies, showcasing the transferability and predictive power of these continuum approaches. Our findings underscore the potential for a unified approach to predict solvation properties, paving the way for enhanced computational studies across various chemical environments.

  5. High-level quantum chemical computations have provided significant insight into the fundamental physical nature of non-covalent interactions. These studies have focused primarily on gas-phase computations of small van der Waals dimers; however, these interactions frequently take place in complex chemical environments, such as proteins, solutions, or solids. To better understand how the chemical environment affects non-covalent interactions, we have undertaken a quantum chemical study of π–π interactions in an aqueous solution, as exemplified by T-shaped benzene dimers surrounded by 28 or 50 explicit water molecules. We report interaction energies (IEs) using second-order Møller–Plesset perturbation theory, and we apply the intramolecular and functional-group partitioning extensions of symmetry-adapted perturbation theory (ISAPT and F-SAPT, respectively) to analyze how the solvent molecules tune the π–π interactions of the solute. For complexes containing neutral monomers, even 50 explicit waters (constituting a first and partial second solvation shell) change total SAPT IEs between the two solute molecules by only tenths of a kcal mol⁻¹, while significant changes of up to 3 kcal mol⁻¹ of the electrostatic component are seen for the cationic pyridinium–benzene dimer. This difference between charged and neutral solutes is attributed to large non-additive three-body interactions within solvated ion-containing complexes. Overall, except for charged solutes, our quantum computations indicate that nearby solvent molecules cause very little “tuning” of the direct solute–solute interactions. This indicates that differences in binding energies between the gas phase and solution phase are primarily indirect effects of the competition between solute–solute and solute–solvent interactions. 