Title: Development and test of highly accurate endpoint free energy methods. 2: Prediction of logarithm of n‐octanol–water partition coefficient (logP) for druglike molecules using MM‐PBSA method
Abstract

The logarithm of the n‐octanol–water partition coefficient (logP) is frequently used as an indicator of lipophilicity in drug discovery, a property that substantially affects the absorption, distribution, metabolism, excretion, and toxicity of a drug candidate. Because experimental measurement of logP is costly and time‐consuming, reliable prediction models for it are of great importance. In this study, we developed a transfer free energy‐based logP prediction model, FElogP. FElogP is based on the simple principle that logP is determined by the free energy change of transferring a molecule from water to n‐octanol. The underlying physical method used to calculate the transfer free energy is molecular mechanics‐Poisson Boltzmann surface area (MM‐PBSA), hence the name free energy‐based logP (FElogP). The FElogP model was validated on a large set of 707 structurally diverse molecules from the ZINC database for which high‐quality measurements were available. Encouragingly, FElogP outperformed several commonly used QSPR or machine learning‐based logP models, as well as some continuum solvation model‐based methods. The root‐mean‐square error (RMSE) and Pearson correlation coefficient (R) between the predicted and measured values are 0.91 log units and 0.71, respectively, while the runner‐up, the logP model implemented in OpenBabel, had an RMSE of 1.13 log units and an R of 0.67. Given that FElogP was not parameterized against experimental logP directly, its excellent performance is likely to extend to arbitrary organic molecules covered by the general AMBER force fields.
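The principle stated in the abstract, that logP follows from the water to n‐octanol transfer free energy, can be illustrated with a minimal sketch. It uses the standard thermodynamic relation logP = −ΔG_transfer / (RT ln 10), where ΔG_transfer is the solvation free energy in n‐octanol minus that in water. The function names, the kcal/mol unit choice, and the 298.15 K default are assumptions made for illustration; the sketch does not reproduce the MM‐PBSA calculation itself, only the conversion and the two reported error metrics.

```python
import math

# Gas constant in kcal/(mol*K); the unit choice is an assumption of this sketch.
GAS_CONSTANT_KCAL = 1.987204e-3


def logp_from_solvation(dg_solv_water, dg_solv_octanol, temperature=298.15):
    """Convert two solvation free energies (kcal/mol) into logP.

    dG_transfer = dG_solv(octanol) - dG_solv(water); a negative value means
    the molecule prefers n-octanol, which gives a positive logP.
    """
    dg_transfer = dg_solv_octanol - dg_solv_water
    return -dg_transfer / (GAS_CONSTANT_KCAL * temperature * math.log(10))


def rmse(predicted, measured):
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)
```

At 298.15 K the factor RT ln 10 is about 1.36 kcal/mol, so each 1.36 kcal/mol of transfer free energy shifts logP by one log unit.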

 
Award ID(s):
1955260
NSF-PAR ID:
10398582
Author(s) / Creator(s):
 ;  ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Journal of Computational Chemistry
Volume:
44
Issue:
13
ISSN:
0192-8651
Page Range / eLocation ID:
p. 1300-1311
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    MF-LOGP, a new method for determining single-component octanol–water partition coefficients (LogP), is presented which uses molecular formula as the only input. Octanol–water partition coefficients are useful in many applications, ranging from environmental fate to drug delivery. Currently, partition coefficients are either experimentally measured or predicted as a function of structural fragments, topological descriptors, or thermodynamic properties known or calculated from precise molecular structures. The MF-LOGP method presented here differs from classical methods in that it does not require any structural information and uses molecular formula as the sole model input. MF-LOGP is therefore useful for situations in which the structure is unknown or where a low-dimensional, easily automatable, and computationally inexpensive calculation is required. MF-LOGP is a random forest algorithm that is trained and tested on 15,377 data points, using 10 features derived from the molecular formula to make LogP predictions. Using an independent validation set of 2713 data points, MF-LOGP was found to have an average RMSE = 0.77 ± 0.007, MAE = 0.52 ± 0.003, and R² = 0.83 ± 0.003. This performance fell within the spectrum of performances reported in the published literature for conventional higher-dimensional models (RMSE = 0.42–1.54, MAE = 0.09–1.07, and R² = 0.32–0.95). Compared with existing models, MF-LOGP requires a maximum of ten features and no structural information, thereby providing a practical and yet predictive tool. The development of MF-LOGP provides the groundwork for development of more physical prediction models leveraging big data analytical methods or complex multicomponent mixtures.
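    As a rough illustration of what "features derived from the molecular formula" could look like, the sketch below parses a simple formula string into element counts and builds a small feature dictionary. The abstract does not list the ten MF-LOGP features, so the features here (element counts, heavy-atom count, H/C ratio) and both function names are hypothetical stand-ins. The parser is deliberately limited: it handles plain formulas such as C8H10N4O2 and does not support parentheses, charges, or isotopes.

```python
import re
from collections import Counter

# One element symbol (capital letter, optional lowercase letter) plus an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def element_counts(formula):
    """Return element counts for a plain molecular formula such as 'C2H6O'."""
    counts = Counter()
    for symbol, number in _TOKEN.findall(formula):
        counts[symbol] += int(number) if number else 1
    return counts


def formula_features(formula):
    """Build a hypothetical feature dictionary from a molecular formula.

    These are illustrative features only, not the ten used by MF-LOGP.
    """
    counts = element_counts(formula)
    carbons = counts.get("C", 0)
    hydrogens = counts.get("H", 0)
    return {
        "n_C": carbons,
        "n_H": hydrogens,
        "n_N": counts.get("N", 0),
        "n_O": counts.get("O", 0),
        "heavy_atoms": sum(n for el, n in counts.items() if el != "H"),
        "h_to_c_ratio": hydrogens / carbons if carbons else 0.0,
    }
```

    A feature dictionary of this kind could then be turned into a numeric vector and passed to any random forest regressor; the abstract does not describe that step, so it is left out here.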

  2. Abstract

    Accurate estimation of solvation free energy (SFE) lays the foundation for accurate prediction of binding free energy. The Poisson‐Boltzmann (PB) and generalized Born (GB) continuum solvation methods combined with surface area (SA) terms (PBSA and GBSA) have been widely used in SFE calculations because they achieve a good balance between accuracy and efficiency. However, the accuracy of these methods can be affected by several factors, such as the charge model, the polar and nonpolar SFE calculation methods, and the atom radii used in the calculation. In this work, the performance of the ABCG2 (AM1‐BCC‐GAFF2) charge model and two other charge models, RESP (Restrained Electrostatic Potential) and AM1‐BCC (Austin Model 1‐bond charge corrections), was evaluated on the SFE prediction of 544 small molecules in water by PBSA/GBSA. To improve the performance of PBSA prediction based on the ABCG2 charges, we further explored the influence of atom radii on prediction accuracy and derived a set of atom radius parameters for more accurate SFE prediction using PBSA with ABCG2/GAFF2 by reproducing thermodynamic integration (TI) calculation results. The PB radius parameters of carbon, oxygen, sulfur, phosphorus, chlorine, bromine, and iodine were adjusted. New atom types (on, oi, hn1, hn2, hn3) were introduced to further improve the fitting performance. Then, we tuned the parameters in the nonpolar SFE model using the experimental SFE data and the PB calculation results. By adopting the new radius parameters and new nonpolar SFE model, the root mean square error (RMSE) of the SFE calculation for the 544 molecules decreased from 2.38 to 1.05 kcal/mol. Finally, the new radius parameters were applied to the prediction of protein‐ligand binding free energies using the MM‐PBSA method. For the eight systems tested, we observed higher correlation between the experimental data and the calculated results, and smaller prediction errors for the absolute binding free energies, demonstrating that the new radius parameters can improve free energy calculations using the MM‐PBSA method.

     
  3. Abstract

    A next‐generation protocol (Poltype 2) has been developed which automatically generates AMOEBA polarizable force field parameters for small molecules. Both features and computational efficiency have been drastically improved. Notable advances include improved database transferability using SMILES, robust torsion fitting, non‐aromatic ring torsion parameterization, coupled torsion‐torsion parameterization, van der Waals parameter refinement using ab initio dimer data, and an intelligent fragmentation scheme that produces parameters with dramatically reduced ab initio computational cost. Additional improvements include better local frame assignment for atomic multipoles, automated formal charge assignment, zwitterion detection, smart memory resource defaults, parallelized fragment job submission, incorporation of the Psi4 quantum package, ab initio error handling, ionization state enumeration, hydration free energy prediction, and binding free energy prediction. For validation, we have applied Poltype 2 to ~1000 FDA approved drug molecules from DrugBank. The ab initio molecular dipole moments and electrostatic potential values were compared with their Poltype 2 derived AMOEBA counterparts. Parameters were further substantiated by calculating hydration free energies (HFE) for 40 small organic molecules and comparing them with experimental data, resulting in an RMSE of 0.59 kcal/mol. The torsion database has expanded to include 3543 fragments derived from FDA approved drugs. Poltype 2 provides a convenient utility for applications including binding free energy prediction for computational drug discovery. Further improvement will focus on automated parameter refinement by experimental liquid properties, expansion of the van der Waals parameter database, and automated parametrization of modified bio‐fragments such as amino and nucleic acids.

     
  4. Abstract

    This work examines methods for predicting the partition coefficient (logP) for a dataset of small molecules. Here, we use atomic attributes such as radius and partial charge, which are typically used as force field parameters in classical molecular dynamics simulations. These atomic attributes are transformed into index‐invariant molecular features using a recently developed method called geometric scattering for graphs (GSG). We call this approach “ClassicalGSG” and examine its performance under a broad range of conditions and hyperparameters. We train ClassicalGSG logP predictors with neural networks using 10,722 molecules from the OpenChem dataset and apply them to predict the logP values from four independent test sets. The ClassicalGSG method's performance is compared to a baseline model that employs graph convolutional networks. Our results show that the best prediction accuracies are obtained using atomic attributes generated with the CHARMM generalized force field and 2D molecular structures.

     
  5. Abstract

    Soil thermal conductivity (λ) is an important thermal property for environmental, agricultural, and engineering heat transfer applications. Existing λ models for frozen soils are complicated to use because they require estimates of both liquid water content and ice content. This study introduces a new approach to estimate λ of partially frozen soils from air‐filled porosity (n_a), which can be determined by using an oven‐drying method. A λ and n_a relationship was established based on measurements for 28 partially frozen soils. A strong exponential relationship between λ and n_a was found (with R² of 0.82). Independent tests on 10 partially frozen soils showed that the exponential λ–n_a model produced reliable λ estimates with an RMSE of 0.319 W m⁻¹ K⁻¹, which was smaller than those of two widely used λ models for partially frozen soils. The λ–n_a model is easier to use than existing models, because it requires fewer parameters. Note that the λ–n_a model ignores the effect of temperature on λ of frozen soils and is most applicable to soil at temperatures of at least −4 °C.
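    The abstract reports an exponential λ–n_a relationship but does not give its coefficients or exact functional form. As a minimal sketch, assuming the form λ = a · exp(b · n_a), the two coefficients can be estimated by ordinary least squares on ln(λ). The function names and the assumed form are illustrative only, and no fitted values from the study are reproduced here.

```python
import math


def fit_exponential(n_a, lam):
    """Fit lam = a * exp(b * n_a) by linear least squares on ln(lam).

    The functional form is an assumption of this sketch; all lam values
    must be positive so that the logarithm is defined.
    """
    log_lam = [math.log(v) for v in lam]
    n = len(n_a)
    mean_x = sum(n_a) / n
    mean_y = sum(log_lam) / n
    sxx = sum((x - mean_x) ** 2 for x in n_a)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(n_a, log_lam))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_lambda(n_a_value, a, b):
    """Evaluate the fitted exponential model at one air-filled porosity value."""
    return a * math.exp(b * n_a_value)
```

    Fitting in log space weights relative rather than absolute errors; a nonlinear fit on λ directly would be an equally reasonable choice and may give slightly different coefficients.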

     