Title: Machine-Learning-Assisted De Novo Design of Organic Molecules and Polymers: Opportunities and Challenges
Organic molecules and polymers have a broad range of applications in biomedical, chemical, and materials science fields. Traditional design approaches for organic molecules and polymers are mainly experimentally driven, guided by experience, intuition, and conceptual insights. Though they have been successfully applied to discover many important materials, these methods face significant challenges due to the tremendous demand for new materials and the vast design space of organic molecules and polymers. Accelerated and inverse materials design is an ideal solution to these challenges. With advancements in high-throughput computation, artificial intelligence (especially machine learning, ML), and the growth of materials databases, ML-assisted materials design is emerging as a promising tool to drive breakthroughs in many areas of materials science and engineering. To date, using ML-assisted approaches, the quantitative structure–property/activity relationship for material property prediction can be established more accurately and efficiently. In addition, materials design can be substantially accelerated through ML-enabled molecular generation and inverse molecular design. In this perspective, we review recent progress in ML-guided design of organic molecules and polymers, highlight several successful examples, and examine future opportunities in biomedical, chemical, and materials science fields. We further discuss the challenges that must be solved to fully realize the potential of ML-assisted materials design for organic molecules and polymers. In particular, this study summarizes publicly available materials databases, feature representations for organic molecules, open-source tools for feature generation, methods for molecular generation, and ML models for prediction of material properties, which together serve as a tutorial for researchers who have little prior experience with ML and want to apply it to various applications. Last but not least, it draws insights into the current limitations of ML-guided design of organic molecules and polymers. We anticipate that ML-assisted materials design for organic molecules and polymers will be a driving force in the near future in meeting the tremendous demand for new materials with tailored properties in different fields.
Award ID(s):
1818574 1729452 1934829 1762661
PAR ID:
10187037
Journal Name:
Polymers
Volume:
12
Issue:
1
ISSN:
2073-4360
Page Range / eLocation ID:
163
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Optical technologies in the long‐wave infrared (LWIR) spectrum (7–14 μm) offer important advantages for high‐resolution thermal imaging in near or complete darkness. The use of polymeric transmissive materials for IR imaging offers numerous cost and processing advantages but suffers from inferior optical properties in the LWIR spectrum. A major challenge in the design of LWIR‐transparent organic materials is that nearly all organic molecules absorb in this spectral window which lies within the so‐called IR‐fingerprint region. We report on a new molecular‐design approach to prepare high refractive index polymers with enhanced LWIR transparency. Computational methods were used to accelerate the design of novel molecules and polymers. Using this approach, we have prepared chalcogenide hybrid inorganic/organic polymers (CHIPs) with enhanced LWIR transparency and thermomechanical properties via inverse vulcanization of elemental sulfur with new organic co‐monomers.
  2. Study of the permeability of small organic molecules across lipid membranes plays a significant role in designing potential drugs. Approaches to designing promising drug molecules have gone through many stages, from experiment-based trial-and-error approaches, to the well-established avenue of the quantitative structure–activity relationship, and currently to the stage guided by machine learning (ML) and artificial intelligence techniques. In this work, we present a study of the permeability of small drug-like molecules across lipid membranes using two types of ML models, namely the least absolute shrinkage and selection operator (LASSO) and deep neural network (DNN) models. Molecular descriptors and fingerprints are used for featurization of organic molecules. Using molecular descriptors, the LASSO model reveals that the electro-topological, electrostatic, polarizability, and hydrophobicity/hydrophilicity properties are the most important physical properties in determining the membrane permeability of small drug-like molecules. Additionally, with molecular fingerprints, the LASSO model suggests that certain chemical substructures can significantly affect the permeability of organic molecules, a finding closely connected to the identified main physical properties. Moreover, the DNN model using molecular fingerprints yields a more accurate mapping between molecular structures and their membrane permeability than the LASSO models. Our results provide a deeper understanding of drug–membrane interactions and useful guidance for the inverse molecular design of drug-like molecules. Last but not least, while the current focus is on the permeability of drug-like molecules, the methodology of this work is general and can be applied to other complex physical chemistry problems to gain molecular insights.
  3. The ever-increasing demand for novel polymers with superior properties requires a deeper understanding and exploration of the chemical space. Recently, data-driven approaches to exploring the chemical space for polymer design have emerged. Among them, inverse design strategies for designing polymers with specific properties have evolved into a significant materials informatics platform, both by learning hidden knowledge from materials data and by navigating the chemical space in an optimized way. In this review, we first summarize the progress in the representation of polymers, a prerequisite step for the inverse design of polymers. Then, we systematically introduce three data-driven strategies implemented for the inverse design of polymers, i.e., high-throughput virtual screening, global optimization, and generative models. Finally, we discuss the challenges and opportunities of the data-driven strategies as well as the optimization algorithms employed in the inverse design of polymers.
  4. The rapid development and application of machine learning (ML) techniques in materials science have led to new tools for machine-enabled and autonomous/high-throughput materials design and discovery. In parallel, efforts to extract data from traditional experiments in the published literature with natural language processing (NLP) algorithms provide opportunities to develop tremendous data troves for these in silico design and discovery endeavors. While NLP is used in all aspects of society, its application in materials science is still in the very early stages. This perspective provides a case study on the application of NLP to extract information related to the preparation of organic materials. We present the case study at a basic level with the aim of discussing these technologies and processes with researchers from diverse scientific backgrounds. We also discuss the challenges faced in the case study and provide an assessment of how to improve the accuracy of NLP techniques for materials science with the aid of community contributions.