Title: A machine-learning-assisted study of the permeability of small drug-like molecules across lipid membranes
The study of the permeability of small organic molecules across lipid membranes plays a significant role in designing potential drugs in the field of drug discovery. Approaches to designing promising drug molecules have gone through many stages, from experiment-based trial-and-error approaches, to the well-established avenue of the quantitative structure–activity relationship, and currently to the stage guided by machine learning (ML) and artificial intelligence techniques. In this work, we present a study of the permeability of small drug-like molecules across lipid membranes using two types of ML models, namely the least absolute shrinkage and selection operator (LASSO) and deep neural network (DNN) models. Molecular descriptors and fingerprints are used for featurization of organic molecules. Using molecular descriptors, the LASSO model uncovers that the electro-topological, electrostatic, polarizability, and hydrophobicity/hydrophilicity properties are the most important physical properties in determining the membrane permeability of small drug-like molecules. Additionally, with molecular fingerprints, the LASSO model suggests that certain chemical substructures can significantly affect the permeability of organic molecules, a finding that connects closely to the identified main physical properties. Moreover, the DNN model using molecular fingerprints can help develop a more accurate mapping between molecular structures and their membrane permeability than the LASSO models. Our results provide a deeper understanding of drug–membrane interactions and useful guidance for the inverse molecular design of drug-like molecules. Last but not least, while the current focus is on the permeability of drug-like molecules, the methodology of this work is general and can be applied to other complex physical chemistry problems to gain molecular insights.
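The way LASSO singles out a few important features, as the abstract describes for molecular descriptors, can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic random numbers standing in for descriptors, the function names (`soft_threshold`, `lasso_coordinate_descent`, `make_toy_data`) are hypothetical, and the penalty strength `alpha=0.1` is an arbitrary assumption. The solver is a plain coordinate-descent LASSO written with the Python standard library only.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw because w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero feature carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


def make_toy_data(n_samples=200, n_features=8, seed=0):
    """Synthetic stand-in for a descriptor matrix: only features 0 and 2 matter."""
    rng = random.Random(seed)
    true_w = [0.0] * n_features
    true_w[0], true_w[2] = 2.0, -1.5
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [sum(c * x for c, x in zip(true_w, row)) + rng.gauss(0, 0.1) for row in X]
    return X, y, true_w


X, y, true_w = make_toy_data()
weights = lasso_coordinate_descent(X, y, alpha=0.1)
# Features with a nonzero coefficient are the ones LASSO "selects".
selected = [j for j, w in enumerate(weights) if abs(w) > 1e-8]
print("selected feature indices:", selected)
print("fitted weights:", [round(w, 3) for w in weights])
```

In a real study the columns of `X` would be computed descriptors or fingerprint bits and `y` a measured or simulated permeability. The L1 penalty drives the coefficients of uninformative columns to exactly zero, which is what makes the surviving features interpretable as the important ones.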
Journal Name:
Physical Chemistry Chemical Physics
Page Range or eLocation-ID:
19687 to 19696
Sponsoring Org:
National Science Foundation
More Like this
  1. Organic molecules and polymers have a broad range of applications in biomedical, chemical, and materials science fields. Traditional design approaches for organic molecules and polymers are mainly experimentally driven, guided by experience, intuition, and conceptual insights. Though they have been successfully applied to discover many important materials, these methods face significant challenges due to the tremendous demand for new materials and the vast design space of organic molecules and polymers. Accelerated and inverse materials design is an ideal solution to these challenges. With advancements in high-throughput computation, artificial intelligence (especially machine learning, ML), and the growth of materials databases, ML-assisted materials design is emerging as a promising tool to foster breakthroughs in many areas of materials science and engineering. To date, using ML-assisted approaches, the quantitative structure property/activity relation for material property prediction can be established more accurately and efficiently. In addition, materials design can be revolutionized and accelerated much faster than ever through ML-enabled molecular generation and inverse molecular design. In this perspective, we review the recent progress in ML-guided design of organic molecules and polymers, highlight several successful examples, and examine future opportunities in biomedical, chemical, and materials science fields. We further discuss the relevant challenges to solve in order to fully realize the potential of ML-assisted materials design for organic molecules and polymers. In particular, this study summarizes publicly available materials databases, feature representations for organic molecules, open-source tools for feature generation, methods for molecular generation, and ML models for prediction of material properties, which serve as a tutorial for researchers who have little prior experience with ML and want to apply it to various applications. Last but not least, it draws insights into the current limitations of ML-guided design of organic molecules and polymers. We anticipate that ML-assisted materials design for organic molecules and polymers will be the driving force in the near future to meet the tremendous demand for new materials with tailored properties in different fields.
  2. Interatomic potentials derived with machine learning algorithms such as deep neural networks (DNNs) achieve the accuracy of high-fidelity quantum mechanical (QM) methods in areas traditionally dominated by empirical force fields and allow massive simulations to be performed. Most DNN potentials were parametrized for neutral molecules or closed-shell ions due to architectural limitations. In this work, we propose an improved machine learning framework for simulating open-shell anions and cations. We introduce the AIMNet-NSE (Neural Spin Equilibration) architecture, which can predict molecular energies for an arbitrary combination of molecular charge and spin multiplicity with errors of about 2–3 kcal/mol and spin-charges with errors of ~0.01e for small and medium-sized organic molecules, compared to the reference QM simulations. The AIMNet-NSE model allows one to fully bypass QM calculations and derive the ionization potential, electron affinity, and conceptual Density Functional Theory quantities like electronegativity, hardness, and condensed Fukui functions. We show that these descriptors, along with learned atomic representations, could be used to model chemical reactivity through an example of regioselectivity in electrophilic aromatic substitution reactions.

  3. Using machine learning (ML) to develop quantitative structure–activity relationship (QSAR) models for contaminant reactivity has emerged as a promising approach because it can effectively handle non-linear relationships. However, ML is often data-demanding, whereas data scarcity is common in QSAR model development. Here, we proposed two approaches to address this issue: combining small datasets and transferring knowledge between them. First, we compiled four individual datasets for four oxidants, i.e., SO4•-, HClO, O3 and ClO2, each dataset containing a different number of contaminants with their corresponding rate constants and reaction conditions (pH and/or temperature). We then used molecular fingerprints (MF) or molecular descriptors (MD) to represent the contaminants; combined them with ML algorithms to develop individual QSAR models for these four datasets; and interpreted the models by the SHapley Additive exPlanations (SHAP) method. The results showed that both the optimal contaminant representation and the best ML algorithm are dataset dependent. Next, we merged these four datasets and developed a unified model, which showed better predictive performance on the datasets of HClO, O3 and ClO2 because the model ‘corrected’ some wrongly learned effects of several atom groups. We further developed knowledge transfer models based on the second approach, the effectiveness of which depends on whether there is consistent knowledge shared between the two datasets as well as on the predictive performance of the respective single models. This study demonstrated the benefit of combining small similar datasets and transferring knowledge between them, which can be leveraged to boost the predictive performance of ML-assisted QSAR models.
  4. Cell-based therapies have the potential to transform the treatment of many diseases. One of the key challenges relating to cell therapies is to modify the cell surface with molecules to modulate cell functions such as targeting, adhesion, migration, and cell–cell interactions, or to deliver drug cargos. Noncovalent insertion of lipid-based amphiphilic molecules on the cell surface is a rapid and nontoxic approach for modifying cells with a variety of bioactive molecules without affecting cellular functions and viability. A wide variety of lipid amphiphiles, including proteins/peptides, carbohydrates, oligonucleotides, drugs, and synthetic polymers, have been designed to spontaneously anchor on plasma membranes. These molecules typically contain a functional component, a spacer, and a long-chain diacyl lipid. Though these molecular constructs appeared to be stably tethered on cell surfaces both in vitro and in vivo under static situations, their stability under mechanical stress (e.g., in the blood flow) remains unclear. Using diacyl lipid-polyethylene glycol (lipo-PEG) conjugates as model amphiphiles, here we report the effect of molecular structures on amphiphile stability on the cell surface under mechanical stress. We analyzed the retention kinetics of lipo-PEGs on erythrocytes in vitro and in vivo and found that under mechanical stress, both the molecular structures of the lipid and the PEG spacer have a profound effect on the membrane retention of membrane-anchored amphiphiles. Our findings highlight the importance of molecular design for the dynamic stability of membrane-anchored amphiphiles.
  5. Claudins are cell-cell adhesion proteins within tight junctions that connect epithelial cells together. Claudins polymerize into a network of strand-like structures within the membrane of adjoining cells and create ion channels that control paracellular permeability to water and small molecules. Tight junction morphology and barrier function are tissue specific and regulated by claudin subtypes. Here, we present a molecular dynamics study of claudin-15 strands within lipid membranes and the role of a single-point mutation (A134P) on the third transmembrane helix (TM3) of claudin-15 in determining the morphology of the strand. Our results indicate that the A134P mutation significantly affects the lateral flexibility of the strands, increasing the persistence length of claudin-15 strands by a factor of three. Analyses of claudin-claudin contact in our microsecond-long trajectories show that the mutation does not alter the intermolecular contacts (interfaces) between claudins. However, the dynamics and frequency of interfacial contacts are significantly affected. The A134P mutation introduces a kink in TM3 of claudin-15 similar to the one observed in the claudin-3 crystal structure. The kink on TM3 skews the rotational flexibility of the claudins in the strands and limits their fluctuation in one direction. This asymmetric movement in the context of the double rows reduces the lateral flexibility of the strand and leads to higher persistence lengths of the mutant.