Title: Predicting the stereoselectivity of chemical reactions by composite machine learning method
Abstract: Stereoselective reactions have played a vital role in the emergence of life, evolution, human biology, and medicine. However, for a long time, most industrial and academic efforts have followed a trial-and-error approach to asymmetric synthesis in stereoselective reactions. In addition, most previous studies have focused qualitatively on the influence of steric and electronic effects on stereoselective reactions. As a result, quantitatively understanding the stereoselectivity of a given chemical reaction remains extremely difficult. As a proof of principle, this paper develops a novel composite machine learning method for quantitatively predicting enantioselectivity, the degree to which one enantiomer is preferentially produced in a reaction. Specifically, machine learning methods that are widely used in data analytics, including Random Forest, Support Vector Regression, and LASSO, are utilized. In addition, Bayesian optimization and permutation importance tests are used to support an in-depth understanding of the reactions and accurate prediction. Finally, the proposed composite method approximates the key features of the available reactions using Gaussian mixture models, which indicate suitable machine learning methods for new reactions. Case studies using real stereoselective reactions show that the proposed method is effective and provides a solid foundation for further application to other chemical reactions.
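The abstract describes enantioselectivity as the degree to which one enantiomer is preferentially produced. As a minimal sketch of how that quantity is commonly expressed, the Python below computes the enantiomeric excess from the amounts of the two enantiomers and converts the enantiomeric ratio into the corresponding free-energy difference, ΔΔG‡ = RT ln(er). This record does not state which target the authors actually regress on, so treating either quantity as the model output is an assumption, and the function names are hypothetical.

```python
import math

# Molar gas constant in J/(mol*K).
R_GAS = 8.314462618


def enantiomeric_excess(amount_a: float, amount_b: float) -> float:
    """Return the enantiomeric excess as a fraction in [0, 1].

    amount_a and amount_b are the amounts (or peak areas) of the two
    enantiomers; 0 means racemic, 1 means a single enantiomer.
    """
    total = amount_a + amount_b
    if amount_a < 0 or amount_b < 0 or total <= 0:
        raise ValueError("amounts must be non-negative with a positive sum")
    return abs(amount_a - amount_b) / total


def ddg_from_ratio(amount_a: float, amount_b: float,
                   temperature_k: float = 298.15) -> float:
    """Return the free-energy difference in kJ/mol implied by the
    enantiomeric ratio, using ddG = R * T * ln(major / minor).

    Illustrative helper only: it assumes the product ratio reflects the
    ratio of rate constants for the two competing pathways.
    """
    if amount_a <= 0 or amount_b <= 0:
        raise ValueError("both amounts must be positive")
    ratio = max(amount_a, amount_b) / min(amount_a, amount_b)
    return R_GAS * temperature_k * math.log(ratio) / 1000.0


if __name__ == "__main__":
    # Example: a 95:5 mixture of enantiomers at room temperature.
    print(f"ee  = {enantiomeric_excess(95, 5):.2f}")
    print(f"ddG = {ddg_from_ratio(95, 5):.2f} kJ/mol")
```

Working with ΔΔG‡ rather than the raw excess is a common choice for regression because it is unbounded and varies linearly with the energy difference between pathways, whereas the excess saturates near 1.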
Award ID(s):
1933525
PAR ID:
10609466
Publisher / Repository:
Scientific Reports
Journal Name:
Scientific Reports
Volume:
14
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. With the increasing popularity of machine-learning (ML) applications, the demand for explainable artificial intelligence (XAI) techniques to explain ML models developed for computational chemistry has also emerged. In this study, we present the development of the Boltzmann-weighted Cumulative Integrated Gradients (BCIG) approach for effective explanation of mechanistic insights into ML models trained on high-level quantum mechanical and molecular mechanical (QM/MM) minimum energy pathways (MEPs). Using the acylation reactions of the Toho-1 β-lactamases and two antibiotic molecules (ampicillin and cefalexin) as the model systems, we show that the BCIG approach could quantitatively attribute the energetic contribution in one system, and the relative reactivity of individual steps across different systems to specific chemical processes such as the bond making/breaking and proton transfers. The proposed BCIG contribution attribution method quantifies chemist-interpretable insights in terms of contributions from each elementary chemical process, which are in agreement with the validating QM/MM calculations and our intuitive mechanistic understandings of the model reactions. 
  2. Abstract Despite significant advances in reconstructing genome-scale metabolic networks, the understanding of cellular metabolism remains incomplete for many organisms. A promising approach for elucidating cellular metabolism is analysing the full scope of enzyme promiscuity, which exploits the capacity of enzymes to bind to non-annotated substrates and generate novel reactions. To guide time-consuming costly experimentation, different computational methods have been proposed for exploring enzyme promiscuity. One relevant algorithm is PROXIMAL, which strongly relies on KEGG to define generic reaction rules and link specific molecular substructures with associated chemical transformations. Here, we present a completely new pipeline, PROXIMAL2, which overcomes the dependency on KEGG data. In addition, PROXIMAL2 introduces two relevant improvements with respect to the former version: i) correct treatment of multi-step reactions and ii) tracking of electric charges in the transformations. We compare PROXIMAL and PROXIMAL2 in recovering annotated products from substrates in KEGG reactions, finding a highly significant improvement in the level of accuracy. We then applied PROXIMAL2 to predict degradation reactions of phenolic compounds in the human gut microbiota. The results were compared to RetroPath RL, a different and relevant enzyme promiscuity method. We found a significant overlap between these two methods but also complementary results, which open new research directions into this relevant question in nutrition. 
  3. Abstract Detailed studies of interfacial gas-phase chemical reactions are important for understanding factors that control materials synthesis and environmental conditions that govern materials performance and degradation. Out of the many materials characterization methods that are available for interpreting gas–solid reaction processes, in situ and operando transmission electron microscopy (TEM) is perhaps the most versatile, multimodal materials characterization technique. It has successfully been utilized to study interfacial gas–solid interactions under a wide range of environmental conditions, such as gas composition, humidity, pressure, and temperature. This stems from decades of R&D that permit controlled gas delivery and the ability to maintain a gaseous environment directly within the TEM column itself or through specialized side-entry gas-cell holders. Combined with capabilities for real-time, high spatial resolution imaging, electron diffraction and spectroscopy, dynamic structural and chemical changes can be investigated to determine fundamental reaction mechanisms and kinetics that occur at site-specific interfaces. This issue of MRS Bulletin covers research in this field ranging from technique development to the utilization of gas-phase microscopy methods that have been used to develop an improved understanding of multilength-scaled processes incurred during materials synthesis, catalytic reactions, and environmental exposure effects on materials properties.
  4. Abstract Cyclobutanes are prominent structural components in natural products and drug molecules. With the advent of strain‐release‐driven synthesis, ring‐opening reactions of bicyclo[1.1.0]butanes (BCBs) provide an attractive pathway to construct these three‐dimensional structures. However, the stereoselective difunctionalization of the central C−C σ‐bonds remains challenging. Reported herein is a covalent‐based organocatalytic strategy that exploits radical NHC catalysis to achieve diastereoselective acylfluoroalkylation of BCBs under mild conditions. The Breslow enolate acts as a single electron donor and provides an NHC‐bound ketyl radical with appropriate steric hindrance, which effectively distinguishes between the two faces of transient cyclobutyl radicals. This operationally simple method tolerates various fluoroalkyl reagents and common functional groups, providing a straightforward access to polysubstituted cyclobutanes (75 examples, up to >19 : 1 d.r.). The combined experimental and theoretical investigations of this organocatalytic system confirm the formation of the NHC‐derived radical and provide an understanding of how stereoselective radical‐radical coupling occurs. 
  5. Abstract Infrared thermography is a non-destructive technique that can be exploited in many fields, including polymer composite investigation. Based on variations in emissivity and thermal diffusivity, the components, defects, and curing state of a composite can be identified. However, manual processing of thermal images, which may contain significant artifacts, is prone to erroneous component and property determination. In this study, thermal images of different graphite/graphene-based polymer composites fabricated by hand, planetary, and batch mixing techniques were analyzed through an automatic machine learning model. Filler size, shape, and location can be identified in polymer composites, and thus the dispersion of different samples was quantified with a resolution of ~20 µm despite artifacts in the thermal image. A thermal diffusivity comparison of the three mixing techniques was performed for 40% graphite in the elastomer. Batch mixing demonstrated better dispersion than planetary and hand mixing: the dispersion index (DI) for batch mixing was 0.07, while planetary and hand mixing showed 0.0865 and 0.163, respectively. Curing was investigated for a polymer with different fillers (PDMS took 500 s, while PDMS-Graphene and PDMS Graphite Powder took 800 s to cure), and a thermal characteristic curve was generated to compare composite quality. Therefore, the above-mentioned methods with machine learning algorithms can be a useful tool for analyzing composites both quantitatively and qualitatively.