Title: Machine Learning for Molecular Simulation
Machine learning (ML) is transforming all areas of science. The complex and time-consuming calculations in molecular simulations are particularly suitable for an ML revolution and have already been profoundly affected by the application of existing ML methods. Here we review recent ML methods for molecular simulation, with particular focus on (deep) neural networks for the prediction of quantum-mechanical energies and forces, on coarse-grained molecular dynamics, on the extraction of free energy surfaces and kinetics, and on generative network approaches to sample molecular equilibrium structures and compute thermodynamics. To explain these methods and illustrate open methodological problems, we review some important principles of molecular physics and describe how they can be incorporated into ML structures. Finally, we identify and describe a list of open challenges for the interface between ML and molecular simulation.
Award ID(s):
1900374
PAR ID:
10148665
Author(s) / Creator(s):
Date Published:
Journal Name:
Annual Review of Physical Chemistry
ISSN:
0066-426X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Machine learning (ML) is transforming all areas of science. The complex and time-consuming calculations in molecular simulations are particularly suitable for an ML revolution and have already been profoundly affected by the application of existing ML methods. Here we review recent ML methods for molecular simulation, with particular focus on (deep) neural networks for the prediction of quantum-mechanical energies and forces, on coarse-grained molecular dynamics, on the extraction of free energy surfaces and kinetics, and on generative network approaches to sample molecular equilibrium structures and compute thermodynamics. To explain these methods and illustrate open methodological problems, we review some important principles of molecular physics and describe how they can be incorporated into ML structures. Finally, we identify and describe a list of open challenges for the interface between ML and molecular simulation.
  2. In this review, we examine how machine learning (ML) can build on molecular simulation (MS) algorithms to greatly advance our ability to predict the thermodynamic properties of a wide range of systems. The key thermodynamic properties that govern the evolution of a system and the outcome of a process include the entropy and the Helmholtz and Gibbs free energies. However, their determination through advanced molecular simulation algorithms has remained challenging, since such methods are extremely computationally intensive. Combining MS with ML provides a solution that overcomes such challenges and, in turn, accelerates discovery through the rapid prediction of free energies. After presenting a brief overview of combined MS–ML protocols, we review how these approaches allow for the accurate prediction of these thermodynamic functions and, more broadly, of free energy landscapes for molecular and biological systems. We then discuss extensions of this approach to systems relevant to energy and environmental applications, i.e., gas storage and separation in nanoporous materials, such as metal–organic frameworks and covalent organic frameworks. In the last part of the review, we show how ML models can suggest new ways to explore free energy landscapes, identify novel pathways, and provide new insight into assembly processes.
  3. Cryo‐electron microscopy (cryo‐EM) has become a major experimental technique to determine the structures of large protein complexes and molecular assemblies, as evidenced by the 2017 Nobel Prize. Although cryo‐EM has been drastically improved to generate high‐resolution three‐dimensional maps that contain detailed structural information about macromolecules, the computational methods for using the data to automatically build structure models are lagging far behind. The traditional cryo‐EM model building approach is template‐based homology modeling. Manual de novo modeling is very time‐consuming when no template model is found in the database. In recent years, de novo cryo‐EM modeling using machine learning (ML) and deep learning (DL) has ranked among the top‐performing methods in macromolecular structure modeling. DL‐based de novo cryo‐EM modeling is an important application of artificial intelligence, with impressive results and great potential for the next generation of molecular biomedicine. Accordingly, we systematically review the representative ML/DL‐based de novo cryo‐EM modeling methods. Their significance is discussed from both practical and methodological viewpoints. We also briefly describe the background of the cryo‐EM data processing workflow. Overall, this review provides an introductory guide to modern research on artificial intelligence for de novo molecular structure modeling and future directions in this emerging field. This article is categorized under: Structure and Mechanism > Molecular Structures; Structure and Mechanism > Computational Biochemistry and Biophysics; Data Science > Artificial Intelligence/Machine Learning.
  4. Tandem mass spectrometry (MS/MS) is crucial for small-molecule analysis; however, traditional computational methods are limited by incomplete reference libraries and complex data processing. Machine learning (ML) is transforming small-molecule mass spectrometry in three key directions: (a) predicting MS/MS spectra and related physicochemical properties to expand reference libraries, (b) improving spectral matching through automated pattern extraction, and (c) predicting molecular structures of compounds directly from their MS/MS spectra. We review ML approaches for molecular representations [descriptors, simplified molecular-input line-entry system (SMILES) strings, and graphs] and MS/MS spectra representations (using binned vectors and peak lists), along with recent advances in the prediction of spectra, retention times, and collision cross sections, and in spectral matching. Finally, we discuss ML-integrated workflows for chemical formula identification. By addressing the limitations of current methods for compound identification, these ML approaches can greatly enhance the understanding of biological processes and the development of diagnostic and therapeutic tools.
  5. Simulating the dynamics of ions near polarizable nanoparticles (NPs) using coarse-grained models is extremely challenging due to the need to solve the Poisson equation at every simulation timestep. Recently, a molecular dynamics (MD) method based on a dynamical optimization framework bypassed this obstacle by representing the polarization charge density as virtual dynamic variables and evolving them in parallel with the physical dynamics of ions. We highlight the computational gains accessible with the integration of machine learning (ML) methods for parameter prediction in MD simulations by demonstrating how they were realized in MD simulations of ions near polarizable NPs. An artificial neural network–based regression model was integrated with the MD simulation and predicted the optimal simulation timestep and optimization parameters characterizing the virtual system with 94.3% success. The ML-enabled auto-tuning of parameters generated accurate dynamics of ions for ≈ 10 million steps while improving the stability of the simulation by over an order of magnitude. The integration of the ML-enhanced framework with hybrid Open Multi-Processing / Message Passing Interface (OpenMP/MPI) parallelization techniques reduced the computational time of simulating systems with thousands of ions and induced charges from thousands of hours to tens of hours, yielding a maximum speedup of ≈ 3 from ML-only acceleration and a maximum speedup of ≈ 600 from the combination of ML and parallel computing methods. Extraction of ionic structure in concentrated electrolytes near oil–water emulsions demonstrates the success of the method. The approach can be generalized to select optimal parameters in other MD applications and energy minimization problems.