Title: SGOOP-d: Estimating kinetic distances and reaction coordinate dimensionality for rare event systems from biased/unbiased simulations
Understanding kinetics, including reaction pathways and associated transition rates, is an important yet difficult problem in numerous chemical and biological systems, especially in situations with multiple competing pathways. When these high-dimensional systems are projected onto low-dimensional coordinates, as is often needed for enhanced sampling or for interpretation of simulations and experiments, one often loses the kinetic connectivity of the underlying high-dimensional landscape. Thus, in the low-dimensional projection, metastable states might appear closer together or further apart than they actually are. To deal with this issue, in this work we develop a formalism that learns a multi-dimensional yet minimally complex reaction coordinate (RC) for generic high-dimensional systems, such that when projected along this RC, all possible kinetically relevant pathways can be demarcated and the true high-dimensional connectivity is maintained. One of the defining attributes of our method is that it can work on long unbiased simulations as well as on the biased simulations often needed for rare event systems. We demonstrate the utility of the method by studying a range of model systems, including conformational transitions in a small peptide, Ace-Ala3-Nme, where we show how SGOOP-derived two-dimensional and three-dimensional reaction coordinates can capture the kinetics for 23 and all 28 of the 28 dominant state-to-state transitions, respectively.
Award ID(s):
1632976
PAR ID:
10287257
Journal Name:
ArXivorg
ISSN:
2331-8422
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Determining collective variables (CVs) for conformational transitions is crucial to understanding their dynamics and targeting them in enhanced sampling simulations. Often, CVs are proposed based on intuition or prior knowledge of a system. However, the problem of systematically determining a proper reaction coordinate (RC) for a specific process in terms of a set of putative CVs can be addressed using committor analysis (CA). Identifying essential degrees of freedom that govern such transitions using CA remains elusive because of the high dimensionality of the conformational space. Various schemes exist to leverage the power of machine learning (ML) to extract an RC from CA. Here, we extend these studies and compare the ability of 17 different ML schemes to identify accurate RCs associated with conformational transitions. We tested these methods on an alanine dipeptide in vacuum and on a sarcosine dipeptoid in an implicit solvent. Our comparison revealed that the light gradient boosting machine method outperforms the other methods. In order to extract key features from the models, we employed Shapley Additive exPlanations analysis and compared its interpretation with the “feature importance” approach. For the alanine dipeptide, our methodology identifies ϕ and θ dihedrals as essential degrees of freedom in the C7ax to C7eq transition. For the sarcosine dipeptoid system, the dihedrals ψ and ω are the most important for the cisαD to transαD transition. We further argue that analysis of the full dynamical pathway, and not just endpoint states, is essential for identifying key degrees of freedom governing transitions.
  2. Base flipping is a key biophysical event involved in recognition of various ligands by ribonucleic acid (RNA) molecules. However, the mechanism of base flipping in RNA remains poorly understood, in part due to the lack of atomistic details on complex rearrangements in neighboring bases. In this work, we applied transition path sampling (TPS) methods to study base flipping in a double-stranded RNA (dsRNA) molecule that is known to interact with RNA-editing enzymes through this mechanism. We obtained an ensemble of 1000 transition trajectories to describe the base-flipping process. We used the likelihood maximization method to determine the refined reaction coordinate (RC) consisting of two collective variables (CVs), a distance and a dihedral angle between nucleotides that form stacking interactions with the flipping base. The free energy profile projected along the refined RC revealed three minima, two corresponding to the initial and final states and one for a metastable state. We suggest that the metastable state likely represents a wobbled conformation of nucleobases observed in NMR studies that is often characterized as the flipped state. The analyses of reactive trajectories further revealed that the base flipping is coupled to a global conformational change in a stem-loop of dsRNA.
  3. Molecular dynamics (MD) simulations generate valuable all-atom resolution trajectories of complex systems, but analyzing this high-dimensional data as well as reaching practical timescales, even with powerful supercomputers, remain open problems. As such, many specialized sampling and reaction coordinate construction methods exist that alleviate these problems. However, these methods typically don't work directly on all atomic coordinates, and still require previous knowledge of the important distinguishing features of the system, known as order parameters (OPs). Here we present AMINO, an automated method that generates such OPs by screening through a very large dictionary of OPs, such as all heavy atom contacts in a biomolecule. AMINO uses ideas from information theory to learn OPs that can then serve as an input for designing a reaction coordinate which can then be used in many enhanced sampling methods. Here we outline its key theoretical underpinnings, and apply it to systems of increasing complexity. Our applications include a problem of tremendous pharmaceutical and engineering relevance, namely, calculating the binding affinity of a protein–ligand system when all that is known is the structure of the bound system. Our calculations are performed in a human-free fashion, obtaining very accurate results compared to long unbiased MD simulations on the Anton supercomputer, but in orders of magnitude less computer time. We thus expect AMINO to be useful for the calculation of thermodynamics and kinetics in the study of diverse molecular systems. 
  4. Many chemical reactions and molecular processes occur on time scales that are significantly longer than those accessible by direct simulations. One successful approach to estimating dynamical statistics for such processes is to use many short time series of observations of the system to construct a Markov state model, which approximates the dynamics of the system as memoryless transitions between a set of discrete states. The dynamical Galerkin approximation (DGA) is a closely related framework for estimating dynamical statistics, such as committors and mean first passage times, by approximating solutions to their equations with a projection onto a basis. Because the projected dynamics are generally not memoryless, the Markov approximation can result in significant systematic errors. Inspired by quasi-Markov state models, which employ the generalized master equation to encode memory resulting from the projection, we reformulate DGA to account for memory and analyze its performance on two systems: a two-dimensional triple well and the AIB9 peptide. We demonstrate that our method is robust to the choice of basis and can decrease the time series length required to obtain accurate kinetics by an order of magnitude. 
  5. Many complex dynamical systems in the real world, including ecological, climate, financial and power-grid systems, often show critical transitions, or tipping points, in which the system’s dynamics suddenly transit into a qualitatively different state. In mathematical models, tipping points happen as a control parameter gradually changes and crosses a certain threshold. Tipping elements in such systems may interact with each other as a network, and understanding the behaviour of interacting tipping elements is a challenge because of the high dimensionality originating from the network. Here, we develop a degree-based mean-field theory for a prototypical double-well system coupled on a network with the aim of understanding coupled tipping dynamics with a low-dimensional description. The method approximates both the onset of the tipping point and the position of equilibria with a reasonable accuracy. Based on the developed theory and numerical simulations, we also provide evidence for multistage tipping point transitions in networks of double-well systems. 