Title: Machine Learning for Molecular Simulation
Machine learning (ML) is transforming all areas of science. The complex and time-consuming calculations in molecular simulations are particularly suitable for an ML revolution and have already been profoundly affected by the application of existing ML methods. Here we review recent ML methods for molecular simulation, with particular focus on (deep) neural networks for the prediction of quantum-mechanical energies and forces, on coarse-grained molecular dynamics, on the extraction of free energy surfaces and kinetics, and on generative network approaches to sample molecular equilibrium structures and compute thermodynamics. To explain these methods and illustrate open methodological problems, we review some important principles of molecular physics and describe how they can be incorporated into ML structures. Finally, we identify and describe a list of open challenges for the interface between ML and molecular simulation.
Award ID(s):
1900374 2019745
NSF-PAR ID:
10223159
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Annual Review of Physical Chemistry
Volume:
71
Issue:
1
ISSN:
0066-426X
Page Range / eLocation ID:
361 to 390
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Machine learning (ML) is transforming all areas of science. The complex and time-consuming calculations in molecular simulations are particularly suitable for an ML revolution and have already been profoundly affected by the application of existing ML methods. Here we review recent ML methods for molecular simulation, with particular focus on (deep) neural networks for the prediction of quantum-mechanical energies and forces, on coarse-grained molecular dynamics, on the extraction of free energy surfaces and kinetics, and on generative network approaches to sample molecular equilibrium structures and compute thermodynamics. To explain these methods and illustrate open methodological problems, we review some important principles of molecular physics and describe how they can be incorporated into ML structures. Finally, we identify and describe a list of open challenges for the interface between ML and molecular simulation.
  2. Abstract

    Cryo‐electron microscopy (cryo‐EM) has become a major experimental technique to determine the structures of large protein complexes and molecular assemblies, as evidenced by the 2017 Nobel Prize. Although cryo‐EM has been drastically improved to generate high‐resolution three‐dimensional maps that contain detailed structural information about macromolecules, the computational methods for using the data to automatically build structure models are lagging far behind. The traditional cryo‐EM model building approach is template‐based homology modeling. Manual de novo modeling is very time‐consuming when no template model is found in the database. In recent years, de novo cryo‐EM modeling using machine learning (ML) and deep learning (DL) has ranked among the top‐performing methods in macromolecular structure modeling. DL‐based de novo cryo‐EM modeling is an important application of artificial intelligence, with impressive results and great potential for the next generation of molecular biomedicine. Accordingly, we systematically review the representative ML/DL‐based de novo cryo‐EM modeling methods. Their significance is discussed from both practical and methodological viewpoints. We also briefly describe the background of the cryo‐EM data processing workflow. Overall, this review provides an introductory guide to modern research on artificial intelligence for de novo molecular structure modeling and future directions in this emerging field.

    This article is categorized under:

    Structure and Mechanism > Molecular Structures

    Structure and Mechanism > Computational Biochemistry and Biophysics

    Data Science > Artificial Intelligence/Machine Learning

  3. Abstract

    Brownian dynamics (BD) is a computational method to simulate molecular diffusion processes. Although the BD method has been developed over several decades and is well established, new methodological developments are improving its accuracy, widening its scope, and increasing its application. In biological applications, BD is used to investigate the diffusive behavior of molecules subject to forces due to intermolecular interactions or interactions with material surfaces. BD can be used to compute rate constants for diffusional association, generate structures of encounter complexes for molecular binding partners, and examine the transport properties of geometrically complex molecules. Often, a series of simulations is performed, for example, for different protein mutants or environmental conditions, so that the effects of the changes on diffusional properties can be estimated. While biomolecules are commonly described at atomic resolution and internal molecular motions are typically neglected, coarse‐graining and the treatment of conformational flexibility are increasingly employed. Software packages for BD simulations of biomolecules are growing in capabilities, with several new packages providing novel features that expand the range of questions that can be addressed. These advances, when used in concert with experiment or other simulation methods, such as molecular dynamics, open new opportunities for application to biochemical and biological systems. Here, we review some of the latest developments in the theory, methods, software, and applications of BD simulations to study biomolecular diffusional association processes and provide a perspective on their future use and application to outstanding challenges in biology, bioengineering, and biomedicine.

    This article is categorized under:

    Structure and Mechanism > Computational Biochemistry and Biophysics

    Molecular and Statistical Mechanics > Molecular Dynamics and Monte‐Carlo Methods

    Software > Simulation Methods

  4. In this review, we examine how machine learning (ML) can build on molecular simulation (MS) algorithms to greatly advance our ability to predict the thermodynamic properties of a wide range of systems. The key thermodynamic properties that govern the evolution of a system and the outcome of a process include the entropy and the Helmholtz and Gibbs free energies. However, their determination through advanced molecular simulation algorithms has remained challenging, since such methods are extremely computationally intensive. Combining MS with ML provides a solution that overcomes such challenges and, in turn, accelerates discovery through the rapid prediction of free energies. After presenting a brief overview of combined MS–ML protocols, we review how these approaches allow for the accurate prediction of these thermodynamic functions and, more broadly, of free energy landscapes for molecular and biological systems. We then discuss extensions of this approach to systems relevant to energy and environmental applications, i.e. gas storage and separation in nanoporous materials, such as metal–organic frameworks and covalent organic frameworks. We finally show in the last part of the review how ML models can suggest new ways to explore free energy landscapes, identify novel pathways and provide new insight into assembly processes.
  5. Organic molecules and polymers have a broad range of applications in biomedical, chemical, and materials science fields. Traditional design approaches for organic molecules and polymers are mainly experimentally driven, guided by experience, intuition, and conceptual insights. Though they have been successfully applied to discover many important materials, these methods face significant challenges due to the tremendous demand for new materials and the vast design space of organic molecules and polymers. Accelerated and inverse materials design is an ideal solution to these challenges. With advancements in high-throughput computation, artificial intelligence (especially machine learning, ML), and the growth of materials databases, ML-assisted materials design is emerging as a promising tool for enabling breakthroughs in many areas of materials science and engineering. To date, using ML-assisted approaches, the quantitative structure property/activity relation for material property prediction can be established more accurately and efficiently. In addition, materials design can be accelerated substantially through ML-enabled molecular generation and inverse molecular design. In this perspective, we review recent progress in ML-guided design of organic molecules and polymers, highlight several successful examples, and examine future opportunities in biomedical, chemical, and materials science fields. We further discuss the relevant challenges that must be solved in order to fully realize the potential of ML-assisted materials design for organic molecules and polymers.
     In particular, this study summarizes publicly available materials databases, feature representations for organic molecules, open-source tools for feature generation, methods for molecular generation, and ML models for prediction of material properties, which serve as a tutorial for researchers who have little prior experience with ML and want to apply it to various applications. Last but not least, it draws insights into the current limitations of ML-guided design of organic molecules and polymers. We anticipate that ML-assisted materials design for organic molecules and polymers will be the driving force in the near future for meeting the tremendous demand for new materials with tailored properties in different fields.