This content will become publicly available on May 31, 2026

Title: Consensus-based adaptive sampling and approximation for high-dimensional energy landscapes
We present a consensus-based framework that unifies phase space exploration with posterior-residual-based adaptive sampling for surrogate construction in high-dimensional energy landscapes. Unlike standard approximation tasks where sampling points can be freely queried, systems with complex energy landscapes such as molecular dynamics (MD) do not permit direct access to arbitrary sampling regions because of physical constraints and energy barriers; the surrogate construction therefore also relies on the dynamical exploration of phase space, which poses a significant numerical challenge. We formulate the problem as a minimax optimization that jointly adapts both the surrogate approximation and residual-enhanced sampling. The construction of free energy surfaces (FESs) for high-dimensional collective variables (CVs) of MD systems is used as a motivating example to illustrate the essential idea. Specifically, the maximization step establishes a stochastic interacting particle system to impose adaptive sampling through both exploitation of a Laplace approximation of the max-residual region and exploration of uncharted phase space via temperature control. The minimization step updates the FES surrogate with the new sample set. Numerical results demonstrate the effectiveness of the present approach for biomolecular systems with up to 30 CVs. While we focus on the FES construction, the developed framework is general for efficient surrogate construction for complex systems with high-dimensional energy landscapes.
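The maximization step described in the abstract drives an interacting particle system toward the region where the surrogate residual is largest. The following is a minimal sketch of a generic consensus-based optimization update, not the authors' scheme: it omits the Laplace approximation and the temperature control, and the toy `residual` function, the parameter names (`beta`, `drift`, `noise`, `dt`), and all numerical values are assumptions chosen for illustration.

```python
import math
import random


def residual(x):
    """Hypothetical stand-in for a surrogate residual, peaked at (1, -1)."""
    return math.exp(-((x[0] - 1.0) ** 2 + (x[1] + 1.0) ** 2))


def consensus_point(particles, beta):
    """Weighted particle mean with weights exp(beta * residual)."""
    values = [residual(p) for p in particles]
    vmax = max(values)  # shift exponents for numerical stability
    weights = [math.exp(beta * (v - vmax)) for v in values]
    total = sum(weights)
    dim = len(particles[0])
    return [
        sum(w * p[d] for w, p in zip(weights, particles)) / total
        for d in range(dim)
    ]


def cbo_step(particles, beta, drift, noise, dt, rng):
    """One Euler-Maruyama step: drift toward the consensus point plus
    component-wise noise scaled by the distance to it."""
    m = consensus_point(particles, beta)
    updated = []
    for p in particles:
        q = []
        for d in range(len(p)):
            diff = p[d] - m[d]
            kick = noise * abs(diff) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            q.append(p[d] - drift * diff * dt + kick)
        updated.append(q)
    return updated


def run(n_particles=200, n_steps=300, seed=0):
    """Evolve particles from a uniform start and return the final consensus point."""
    rng = random.Random(seed)
    particles = [
        [rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)]
        for _ in range(n_particles)
    ]
    for _ in range(n_steps):
        particles = cbo_step(
            particles, beta=30.0, drift=1.0, noise=0.7, dt=0.05, rng=rng
        )
    return consensus_point(particles, beta=30.0)


if __name__ == "__main__":
    print(run())  # expected to land near the residual peak at (1, -1)
```

In a minimax loop of the kind the abstract outlines, the particles gathered by such a step would be added to the sample set before the surrogate is refit; that refitting step is not shown here.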
Award ID(s):
2143739
PAR ID:
10610893
Author(s) / Creator(s):
;
Publisher / Repository:
arXiv.org
Date Published:
Journal Name:
arXiv.org
ISSN:
2331-8422
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We propose a generative model-based framework for learning collective variables (CVs) that faithfully capture the individual metastable states of full-dimensional molecular dynamics (MD) systems. Unlike most existing approaches based on various feature extraction strategies, the new framework transfers the exhausting effort of feature selection into a generative task of reconstructing the full-dimensional probability density function (PDF) from a set of CVs under a prior distribution with pre-assigned local maxima. By pairing the CVs with a set of auxiliary Gaussian random variables, we seek an invertible mapping that recovers the full-dimensional PDF and, meanwhile, preserves the correspondence between the metastable states of the MD space and individual local maxima of the prior distribution. By identifying the generally unknown metastable states within the MD space and imposing the correspondence between the two spaces, the constructed CVs retain clear physical interpretations and provide kinetic insight for the molecular systems on the collective scale. We demonstrate the effectiveness of the proposed method with the alanine dipeptide in the aqueous environment. The constructed CVs faithfully capture the essential metastable states of the full MD systems and show good agreement with kinetic properties such as the transition from the ballistic to the plateau regime for the mean square displacement.
  2. Collective variable (CV)‐based enhanced sampling techniques are widely used today for accelerating barrier‐crossing events in molecular simulations. A class of these methods, which includes temperature accelerated molecular dynamics (TAMD)/driven‐adiabatic free energy dynamics (d‐AFED), unified free energy dynamics (UFED), and temperature accelerated sliced sampling (TASS), uses an extended variable formalism to achieve quick exploration of conformational space. These techniques are powerful, as they enhance the sampling of a large number of CVs simultaneously compared to other techniques. Extended variables are kept at a much higher temperature than the physical temperature by ensuring adiabatic separation between the extended and physical subsystems and employing rigorous thermostatting. In this work, we present a computational platform to perform extended phase space enhanced sampling simulations using the open‐source molecular dynamics engine OpenMM. The implementation allows users to have interoperability of sampling techniques, as well as employ state‐of‐the‐art thermostats and multiple time‐stepping. This work also presents protocols for determining the critical parameters and procedures for reconstructing high‐dimensional free energy surfaces. As a demonstration, we present simulation results on the high dimensional conformational landscapes of the alanine tripeptide in vacuo, tetra‐N‐methylglycine (tetra‐sarcosine) peptoid in implicit solvent, and the Trp‐cage mini protein in explicit water.
  3. Rapid computational exploration of the free energy landscape of biological molecules remains an active area of research due to the difficulty of sampling rare state transitions in molecular dynamics (MD) simulations. In recent years, an increasing number of studies have exploited machine learning (ML) models to enhance and analyze MD simulations. Notably, unsupervised models that extract kinetic information from a set of parallel trajectories have been proposed, including the variational approach for Markov processes (VAMP), VAMPNets, and time-lagged variational autoencoders (TVAE). In this work, we propose a combination of adaptive sampling with active learning of kinetic models to accelerate the discovery of the conformational landscape of biomolecules. In particular, we introduce and compare several techniques that combine kinetic models with two adaptive sampling regimes (least counts and multiagent reinforcement learning-based adaptive sampling) to enhance the exploration of conformational ensembles without introducing biasing forces. Moreover, inspired by the active learning approach of uncertainty-based sampling, we also present MaxEnt VAMPNet. This technique consists of restarting simulations from the microstates that maximize the Shannon entropy of a VAMPNet trained to perform the soft discretization of metastable states. By running simulations on two test systems, the WLALL pentapeptide and the villin headpiece subdomain, we empirically demonstrate that MaxEnt VAMPNet results in faster exploration of conformational landscapes compared with the baseline and other proposed methods.
  4. Molecular dynamics (MD) has served as a powerful tool for designing materials with reduced reliance on laboratory testing. However, the use of MD directly to treat the deformation and failure of materials at the mesoscale is still largely beyond reach. In this work, we propose a learning framework to extract a peridynamics model as a mesoscale continuum surrogate from MD simulated material fracture data sets. First, we develop a novel coarse-graining method to automatically handle the material fracture and its corresponding discontinuities in the MD displacement data sets. Inspired by the weighted essentially non-oscillatory (WENO) scheme, the key idea lies in an adaptive procedure that automatically chooses the locally smoothest stencil and then reconstructs the coarse-grained material displacement field as piecewise smooth solutions containing discontinuities. Then, based on the coarse-grained MD data, a two-phase optimization-based learning approach is proposed to infer the optimal peridynamics model with damage criterion. In the first phase, we identify the optimal nonlocal kernel function from the data sets without material damage to capture the material stiffness properties. In the second phase, the material damage criterion is learnt as a smoothed step function from the data with fractures. As a result, a peridynamics surrogate is obtained. As a continuum model, our peridynamics surrogate model can be employed in further prediction tasks with grid resolutions different from those used in training, and hence allows for substantial reductions in computational cost compared with MD. We illustrate the efficacy of the proposed approach with several numerical tests for the dynamic crack propagation problem in single-layer graphene. Our tests show that the proposed data-driven model is robust and generalizable, in the sense that it is capable of modeling the initialization and growth of fractures under discretization and loading settings that are different from the ones used during training.
  5. Molecular dynamics (MD) simulations are fundamental computational tools for the study of proteins and their free energy landscapes. However, sampling protein conformational changes through MD simulations is challenging due to the relatively long time scales of these processes. Many enhanced sampling approaches have emerged to tackle this problem, including biased sampling and path-sampling methods. In this Perspective, we focus on adaptive sampling algorithms. These techniques differ from other approaches because the thermodynamic ensemble is preserved and the sampling is enhanced solely by restarting MD trajectories at particularly chosen seeds rather than introducing biasing forces. We begin our treatment with an overview of theoretically transparent methods, where we discuss principles and guidelines for adaptive sampling. Then, we present a brief summary of select methods that have been applied to realistic systems in the past. Finally, we discuss recent advances in adaptive sampling methodology powered by deep learning techniques, as well as their shortcomings. 
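Item 3 in the list above describes restarting simulations from the microstates that maximize the Shannon entropy of a soft state assignment. The selection rule alone can be sketched as follows. This is only an illustration under stated assumptions: the per-frame probabilities are supplied by hand rather than produced by a trained VAMPNet, and the function names `shannon_entropy` and `select_restart_frames` are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one soft state assignment."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_restart_frames(soft_assignments, n_seeds):
    """Return the indices of the n_seeds frames whose soft assignment has the
    highest entropy, i.e. the frames the model is least certain about."""
    ranked = sorted(
        range(len(soft_assignments)),
        key=lambda i: shannon_entropy(soft_assignments[i]),
        reverse=True,
    )
    return ranked[:n_seeds]


if __name__ == "__main__":
    # Hypothetical soft assignments of four frames to three metastable states.
    assignments = [
        [0.98, 0.01, 0.01],  # confidently in state 0
        [0.34, 0.33, 0.33],  # nearly uniform: highest entropy
        [0.60, 0.30, 0.10],
        [0.50, 0.50, 0.00],
    ]
    print(select_restart_frames(assignments, n_seeds=2))  # prints [1, 2]
```

Frames with near-uniform assignments sit between metastable states, which is why an entropy-based rule favors them as restart seeds.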