Title: MLMOD: Machine Learning Methods for Data-Driven Modeling in LAMMPS
MLMOD is a software package for incorporating machine learning approaches and models into simulations of microscale mechanics and molecular dynamics in LAMMPS. Recent machine learning methods offer promising data-driven approaches for learning representations of system behaviors from experimental data and high-fidelity simulations. The package facilitates learning and using data-driven models for (i) dynamics of the system at larger spatial-temporal scales, (ii) interactions between system components, (iii) features yielding coarser degrees of freedom, and (iv) features for new quantities of interest characterizing system behaviors. MLMOD provides hooks in LAMMPS for (i) modeling dynamics and time-step integration, (ii) modeling interactions, and (iii) computing quantities of interest characterizing system states. The package allows for use of machine learning methods with general model classes including Neural Networks, Gaussian Process Regression, Kernel Models, and other approaches. Here we discuss our prototype C++/Python package, its aims, and example usage. The package is currently integrated with the mesoscale and molecular dynamics simulation package LAMMPS and with PyTorch.
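As a rough illustration of what "modeling dynamics and time-step integration" with a data-driven model can mean, the sketch below fits a simple kernel model to synthetic velocity data and then uses it inside a forward-Euler update. It is a toy in plain Python and does not use the MLMOD, LAMMPS, or PyTorch APIs; the function names (`make_kernel_model`, `integrate`), the training data, and the choice of a Nadaraya-Watson kernel smoother are all assumptions made for this example.

```python
import math

def make_kernel_model(xs, ys, length_scale=0.2):
    """Return a Nadaraya-Watson kernel regressor built from (xs, ys).

    Hypothetical stand-in for a learned model: predictions are
    Gaussian-weighted averages of the training targets.
    """
    def predict(x):
        weights = [math.exp(-0.5 * ((x - xi) / length_scale) ** 2) for xi in xs]
        total = sum(weights)
        return sum(w * y for w, y in zip(weights, ys)) / total
    return predict

def integrate(model, x0, dt, n_steps):
    """Forward-Euler update x_{n+1} = x_n + dt * model(x_n).

    The learned model supplies the velocity at each step.
    """
    trajectory = [x0]
    x = x0
    for _ in range(n_steps):
        x = x + dt * model(x)
        trajectory.append(x)
    return trajectory

# Synthetic training data (an assumption for this sketch): the
# velocity field v(x) = -x, i.e. relaxation toward the origin.
train_x = [i * 0.1 - 2.0 for i in range(41)]
train_v = [-x for x in train_x]

model = make_kernel_model(train_x, train_v)
trajectory = integrate(model, x0=1.0, dt=0.01, n_steps=500)
```

In this toy setting the integrated trajectory relaxes toward the origin, as the training data suggest it should. In a real workflow the model would instead be trained on experimental or high-fidelity simulation data and evaluated from within the simulation package.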
Award ID(s):
2306101
NSF-PAR ID:
10539869
Publisher / Repository:
Journal of Open Source Software
Journal Name:
Journal of Open Source Software
Volume:
8
Issue:
89
ISSN:
2475-9066
Page Range / eLocation ID:
5620
Subject(s) / Keyword(s):
Machine Learning; AI; Data-Driven Analysis; Simulation; Open Source Software; Soft Materials; Computational Physics; LAMMPS; Fluid-Structure Interaction; Stochastic Eulerian Lagrangian Methods (SELMs); Molecular Dynamics; Complex Fluids; Rheology; Coarse-Grained Simulations; Colloids; Membranes; Polymers
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Summary

    Molecular dynamics simulations have found use in a wide variety of biomolecular applications, from protein folding kinetics to computational drug design to refinement of molecular structures. Two areas where users and developers frequently need to extend the built-in capabilities of most software packages are implementing custom interactions, for instance biases derived from experimental data, and running ensembles of simulations. We present a Python high-level interface for the popular simulation package GROMACS that i) allows custom potential functions without modifying the simulation package code, ii) maintains the optimized performance of GROMACS and iii) presents an abstract interface to building and executing computational graphs that allows transparent low-level optimization of data flow and task placement. Minimal dependencies make this integrated API for the GROMACS simulation engine simple, portable and maintainable. We demonstrate this API for experimentally-driven refinement of protein conformational ensembles.

    Availability and implementation

    LGPLv2.1 source and instructions are available at https://github.com/kassonlab/gmxapi.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  2. Small integration time steps limit molecular dynamics (MD) simulations to millisecond time scales. Markov state models (MSMs) and equation-free approaches learn low-dimensional kinetic models from MD simulation data by performing configurational or dynamical coarse-graining of the state space. The learned kinetic models enable the efficient generation of dynamical trajectories over vastly longer time scales than are accessible by MD, but the discretization of configurational space and/or absence of a means to reconstruct molecular configurations precludes the generation of continuous all-atom molecular trajectories. We propose latent space simulators (LSS) to learn kinetic models for continuous all-atom simulation trajectories by training three deep learning networks to (i) learn the slow collective variables of the molecular system, (ii) propagate the system dynamics within this slow latent space, and (iii) generatively reconstruct molecular configurations. We demonstrate the approach in an application to Trp-cage miniprotein to produce novel ultra-long synthetic folding trajectories that accurately reproduce all-atom molecular structure, thermodynamics, and kinetics at six orders of magnitude lower cost than MD. The dramatically lower cost of trajectory generation enables greatly improved sampling and greatly reduced statistical uncertainties in estimated thermodynamic averages and kinetic rates. 
  3. We provide an atomic-level description of the structure and dynamics of the UUCG RNA stem–loop by combining molecular dynamics simulations with experimental data. The integration of simulations with exact nuclear Overhauser enhancement (eNOE) data allowed us to characterize two distinct states of this molecule. The most stable conformation corresponds to the consensus three-dimensional structure. The second state is characterized by the absence of the peculiar non-Watson–Crick interactions in the loop region. By using machine learning techniques we identify a set of experimental measurements that are most sensitive to the presence of non-native states. We find that although our MD ensemble, as well as the consensus UUCG tetraloop structures, are in good agreement with experiments, there are remaining discrepancies. Together, our results show that (i) the MD simulations overstabilize a non-native loop conformation, (ii) eNOE data support its presence with a population of ≈10%, and (iii) the structural interpretation of experimental data for dynamic RNAs is highly complex, even for a simple model system such as the UUCG tetraloop.
  4. Abstract

    We develop a machine learning tool useful for predicting the instantaneous dynamical state of sub-monomer features within long linear polymer chains, as well as extracting the dominant macromolecular motions associated with sub-monomer behaviors of interest. We employ the tool to better understand and predict sub-monomer A2 domain unfolding dynamics occurring amidst the dominant large-scale macromolecular motions of the biopolymer von Willebrand Factor (vWF) immersed in flow. Results of coarse-grained Molecular Dynamics (MD) simulations of non-grafted vWF multimers subject to a shearing flow were used as input variables to a Random Forest Algorithm (RFA). Twenty unique features characterizing macromolecular conformation information of vWF multimers were used for training the RFA. The corresponding responses classify instantaneous A2 domain state as either folded or unfolded, and were directly taken from coarse-grained MD simulations. Three separate RFAs were trained using feature/response data of varying resolution, which provided deep insights into the highly correlated macromolecular dynamics occurring in concert with A2 domain unfolding events. The algorithm is used to analyze results of simulation, but has been developed for use with experimental data as well.

     
  5. DeePMD-kit is a powerful open-source software package that facilitates molecular dynamics simulations using machine learning potentials known as Deep Potential (DP) models. This package, which was released in 2017, has been widely used in the fields of physics, chemistry, biology, and material science for studying atomistic systems. The current version of DeePMD-kit offers numerous advanced features, such as DeepPot-SE, attention-based and hybrid descriptors, the ability to fit tensorial properties, type embedding, model deviation, DP-range correction, DP long range, graphics processing unit support for customized operators, model compression, non-von Neumann molecular dynamics, and improved usability, including documentation, compiled binary packages, graphical user interfaces, and application programming interfaces. This article presents an overview of the current major version of the DeePMD-kit package, highlighting its features and technical details. Additionally, this article presents a comprehensive procedure for conducting molecular dynamics as a representative application, benchmarks the accuracy and efficiency of different models, and discusses ongoing developments.

     