Title: A deep neural network model for packing density predictions and its application in the study of 1.5 million organic molecules
The process of developing new compounds and materials is increasingly driven by computational modeling and simulation, which allow us to characterize candidates before pursuing them in the laboratory. One of the non-trivial properties of interest for organic materials is their packing in the bulk, which is highly dependent on their molecular structure. By controlling the latter, we can realize materials with a desired density (as well as other target properties). Molecular dynamics simulations are a popular and reasonably accurate way to compute the bulk density of molecules; however, since these calculations are computationally intensive, they are not a practically viable option for high-throughput screening studies that assess material candidates on a massive scale. In this work, we employ machine learning to develop a data-derived prediction model that is an alternative to physics-based simulations, and we utilize it for the hyperscreening of 1.5 million small organic molecules as well as to gain insights into the relationship between structural makeup and packing density. We also use this study to analyze the learning curve of the employed neural network approach and gain empirical data on the dependence of model performance on training data size, which will inform future investigations.
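As an illustration of what a learning-curve analysis can involve, the sketch below fits a power law, error ≈ a · N^(-b), to prediction error as a function of training set size N by linear regression in log-log space. This is a generic technique and not necessarily the procedure used in the paper; the function name `fit_power_law` is hypothetical, and the data points are synthetic values generated for the example, not results from the study.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = a * n**(-b) by least squares on log(error) vs. log(n).

    Hypothetical helper for illustration only. Returns the pair (a, b).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Ordinary least-squares slope and intercept of the log-log line.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic data (an assumption for the example): errors that follow an
# exact power law with a = 2.0 and b = 0.5.
sizes = [1_000, 10_000, 100_000, 1_000_000]
errors = [2.0 * n ** -0.5 for n in sizes]

a, b = fit_power_law(sizes, errors)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real data, the fitted exponent b gives a rough estimate of how much additional training data would be needed to reach a target error, provided the power-law trend continues to hold.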
Award ID(s):
1751161
PAR ID:
10144789
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Chemical Science
Volume:
10
Issue:
36
ISSN:
2041-6520
Page Range / eLocation ID:
8374 to 8383
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Study of the permeability of small organic molecules across lipid membranes plays a significant role in designing potential drugs in the field of drug discovery. Approaches to design promising drug molecules have gone through many stages, from experiment-based trial-and-error approaches, to the well-established avenue of the quantitative structure–activity relationship, and currently to the stage guided by machine learning (ML) and artificial intelligence techniques. In this work, we present a study of the permeability of small drug-like molecules across lipid membranes by two types of ML models, namely the least absolute shrinkage and selection operator (LASSO) and deep neural network (DNN) models. Molecular descriptors and fingerprints are used for featurization of organic molecules. Using molecular descriptors, the LASSO model uncovers that the electro-topological, electrostatic, polarizability, and hydrophobicity/hydrophilicity properties are the most important physical properties to determine the membrane permeability of small drug-like molecules. Additionally, with molecular fingerprints, the LASSO model suggests that certain chemical substructures can significantly affect the permeability of organic molecules, which closely connects to the identified main physical properties. Moreover, the DNN model using molecular fingerprints can help develop a more accurate mapping between molecular structures and their membrane permeability than LASSO models. Our results provide a deep understanding of drug–membrane interactions and useful guidance for the inverse molecular design of drug-like molecules. Last but not least, while the current focus is on the permeability of drug-like molecules, the methodology of this work is general and can be applied to other complex physical chemistry problems to gain molecular insights.
  2. Organic molecules and polymers have a broad range of applications in biomedical, chemical, and materials science fields. Traditional design approaches for organic molecules and polymers are mainly experimentally driven, guided by experience, intuition, and conceptual insights. Though they have been successfully applied to discover many important materials, these methods are facing significant challenges due to the tremendous demand for new materials and the vast design space of organic molecules and polymers. Accelerated and inverse materials design is an ideal solution to these challenges. With advancements in high-throughput computation, artificial intelligence (especially machine learning, ML), and the growth of materials databases, ML-assisted materials design is emerging as a promising tool for enabling breakthroughs in many areas of materials science and engineering. To date, using ML-assisted approaches, the quantitative structure property/activity relation for material property prediction can be established more accurately and efficiently. In addition, materials design can be revolutionized and accelerated through ML-enabled molecular generation and inverse molecular design. In this perspective, we review the recent progress in ML-guided design of organic molecules and polymers, highlight several successful examples, and examine future opportunities in biomedical, chemical, and materials science fields. We further discuss the relevant challenges to solve in order to fully realize the potential of ML-assisted materials design for organic molecules and polymers.
     In particular, this study summarizes publicly available materials databases, feature representations for organic molecules, open-source tools for feature generation, methods for molecular generation, and ML models for prediction of material properties, which serve as a tutorial for researchers who have little prior experience with ML and want to apply it to various applications. Last but not least, it draws insights into the current limitations of ML-guided design of organic molecules and polymers. We anticipate that ML-assisted materials design for organic molecules and polymers will be the driving force in the near future, to meet the tremendous demand for new materials with tailored properties in different fields.
  3. Zwitterionic materials are an important class of antifouling biomaterials for various applications. Despite such desirable antifouling properties, molecular-level understanding of the structure–property relationship associated with surface chemistry/topology/hydration and antifouling performance still remains to be elucidated. In this work, we computationally studied the packing structure, surface hydration, and antifouling property of three zwitterionic polymer brushes, poly(carboxybetaine methacrylate) (pCBMA), poly(sulfobetaine methacrylate) (pSBMA), and poly((2-(methacryloyloxy)ethyl)phosphorylcholine) (pMPC), and a hydrophilic PEG brush using a combination of molecular mechanics (MM), Monte Carlo (MC), molecular dynamics (MD), and steered MD (SMD) simulations. For the first time, we determined the optimal packing structures of all polymer brushes from a wide variety of unit cells and chain orientations in a complex energy landscape. Under the optimal packing structures, MD simulations were further conducted to study the structure, dynamics, and orientation of water molecules and protein adsorption on the four polymer brushes, while SMD simulations were used to study the surface resistance of the polymer brushes to a protein. The collective results consistently revealed that the three zwitterionic brushes exhibited stronger interactions with water molecules and higher surface resistance to a protein than the PEG brush. It was concluded that both the carbon spacer length between zwitterionic groups and the nature of the anionic groups have a distinct effect on the antifouling performance, leading to the following antifouling ranking: pCBMA > pMPC > pSBMA. This work hopefully provides some structural insights into the design of new antifouling materials beyond traditional PEG-based antifouling materials.
  4. Introducing charge carriers is of paramount importance for increasing the efficiency of organic semiconducting materials. Various methods of extrinsic doping, where molecules or atoms with large/small reduction potentials are blended with the semiconductor, can lead to dopant aggregation, migration, phase segregation, and morphology alteration. Self-doping overcomes these challenges by structurally linking the dopant directly to the organic semiconductor. However, for their practical incorporation into devices, self-doped organic materials must be cast into thin films, yet processing methods to allow for the formation of continuous and uniform films have not been developed beyond simple drop casting. Whilst self-doped organic molecules afford the remarkable ability to position dopants with molecular precision and control of attachment mode, their steric bulk inevitably disrupts the crystallization on surfaces. As such, there is great interest in the development of processing modalities that allow deposited molecules to converge to the thermodynamic minimum of a well-ordered and highly crystalline organic thin film instead of getting trapped in local disordered minima that represent metastable configurations. By contrasting drop casting, ultrasonic deposition, and physical vapor deposition, we investigate the free energy landscape of the crystallization of sterically hindered self-doped perylene diimide thin films. A clear relationship is established between processing conditions, the crystallinity and order within the deposited films, the dopant structures, and the resulting spin density. We find physical vapor deposition to be a robust method capable of producing smooth, continuous, highly ordered self-doped organic small-molecule thin films with tailored spin concentrations and well-defined morphologies.
  5. Molecular simulations with atomistic or coarse-grained force fields are a powerful approach for understanding and predicting the self-assembly phase behavior of complex molecules. Amphiphiles, block oligomers, and block polymers can form mesophases with different ordered morphologies describing the spatial distribution of the blocks, but entirely amorphous nature for local packing and chain conformation. Screening block oligomer chemistry and architecture through molecular simulations to find promising candidates for functional materials is aided by effective and straightforward morphology identification techniques. Capturing 3-dimensional periodic structures, such as ordered network morphologies, is hampered by the requirement that the number of molecules in the simulated system and the shape of the periodic simulation box need to be commensurate with those of the resulting network phase. Common strategies for structure identification include structure factors and order parameters, but these fail to identify imperfect structures in simulations with incorrect system sizes. Building upon pioneering work by DeFever et al. [Chem. Sci. 2019, 10, 7503−7515] who implemented a PointNet (i.e., a neural network designed for computer vision applications using point clouds) to detect local structure in simulations of single-bead particles and water molecules, we present a PointNet for detection of nonlocal ordered morphologies of complex block oligomers. Our PointNet was trained using atomic coordinates from molecular dynamics simulation trajectories and synthetic point clouds for ordered network morphologies that were absent from previous simulations. In contrast to prior work on simple molecules, we observe that large point clouds with 1000 or more points are needed for the more complex block oligomers. The trained PointNet model achieves an accuracy as high as 0.99 for globally ordered morphologies formed by linear diblock, linear triblock, and 3-arm and 4-arm star-block oligomers, and it also allows for the discovery of emerging ordered patterns from nonequilibrium systems.