We propose a generative model-based framework for learning collective variables (CVs) that faithfully capture the individual metastable states of full-dimensional molecular dynamics (MD) systems. Unlike most existing approaches, which rely on various feature extraction strategies, the new framework recasts the laborious task of feature selection as a generative task: reconstructing the full-dimensional probability density function (PDF) from a set of CVs under a prior distribution with pre-assigned local maxima. By pairing the CVs with a set of auxiliary Gaussian random variables, we seek an invertible mapping that recovers the full-dimensional PDF while preserving the correspondence between the metastable states of the MD space and the individual local maxima of the prior distribution. By identifying the generally unknown metastable states within the MD space and imposing the correspondence between the two spaces, the constructed CVs retain clear physical interpretations and provide kinetic insight into the molecular systems on the collective scale. We demonstrate the effectiveness of the proposed method on alanine dipeptide in an aqueous environment. The constructed CVs faithfully capture the essential metastable states of the full MD system and show good agreement with kinetic properties such as the transition from the ballistic to the plateau regime of the mean square displacement.
An exploration of machine learning models for the determination of reaction coordinates associated with conformational transitions
Determining collective variables (CVs) for conformational transitions is crucial to understanding their dynamics and targeting them in enhanced sampling simulations. Often, CVs are proposed based on intuition or prior knowledge of a system. However, a proper reaction coordinate (RC) for a specific process can be determined systematically, in terms of a set of putative CVs, using committor analysis (CA). Identifying the essential degrees of freedom that govern such transitions using CA remains elusive because of the high dimensionality of the conformational space. Various schemes exist that leverage machine learning (ML) to extract an RC from CA. Here, we extend these studies and compare the ability of 17 different ML schemes to identify accurate RCs associated with conformational transitions. We tested these methods on an alanine dipeptide in vacuum and on a sarcosine dipeptoid in an implicit solvent. Our comparison revealed that the light gradient boosting machine method outperforms the other methods. To extract key features from the models, we employed Shapley Additive exPlanations analysis and compared its interpretation with the “feature importance” approach. For the alanine dipeptide, our methodology identifies the ϕ and θ dihedrals as essential degrees of freedom in the C7ax to C7eq transition. For the sarcosine dipeptoid system, the dihedrals ψ and ω are the most important for the cisαD to transαD transition. We further argue that analysis of the full dynamical pathway, and not just the endpoint states, is essential for identifying the key degrees of freedom governing transitions.
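As background for the committor analysis mentioned above: the committor of a configuration is the probability that a trajectory started there reaches the product state before the reactant state. The following is a minimal sketch of estimating it by shooting trajectories. It assumes a hypothetical one-dimensional double-well potential with overdamped Langevin dynamics, not the dipeptide systems or the ML pipeline of the study; the function name, state boundaries, and all parameter values are illustrative assumptions.

```python
import numpy as np

def estimate_committor(x0, n_shots=200, dt=1e-3, kT=0.5,
                       max_steps=20000, seed=0):
    """Estimate the probability of reaching state B before state A from x0.

    Hypothetical toy model: V(x) = (x^2 - 1)^2 with overdamped Langevin
    dynamics; state A is x < -0.8 and state B is x > 0.8.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_shots, float(x0))          # all shots start at x0
    hit_a = np.zeros(n_shots, dtype=bool)    # shots that reached A first
    hit_b = np.zeros(n_shots, dtype=bool)    # shots that reached B first
    for _ in range(max_steps):
        active = ~(hit_a | hit_b)            # shots not yet committed
        if not active.any():
            break
        xa = x[active]
        force = -4.0 * xa * (xa**2 - 1.0)    # -dV/dx
        noise = rng.standard_normal(xa.size)
        x[active] = xa + force * dt + np.sqrt(2.0 * kT * dt) * noise
        hit_a |= x < -0.8
        hit_b |= x > 0.8
    committed = hit_a | hit_b
    # Fraction of committed shots that ended in B
    return hit_b.sum() / max(committed.sum(), 1)

if __name__ == "__main__":
    for start in (-0.6, 0.0, 0.6):
        print(start, estimate_committor(start))
```

In this toy model the barrier top at x = 0 plays the role of the transition state, so its estimated committor should lie near one half, while starting points close to either well commit predominantly to that well. In a committor-based RC search, values of this kind, computed for many configurations, would serve as the regression target for models built on the putative CVs.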
- Award ID(s): 1955381
- PAR ID: 10528889
- Publisher / Repository: American Institute of Physics
- Date Published:
- Journal Name: The Journal of Chemical Physics
- Volume: 159
- Issue: 3
- ISSN: 0021-9606
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Shea, Joan-Emma (Ed.) We aim to automate the identification of collective variables to simplify and speed up enhanced sampling simulations of conformational dynamics in biomolecules. We focus on anharmonic low-frequency vibrations that exhibit fluctuations on timescales faster than conformational transitions but describe a path of least resistance towards structural change. A key challenge is that harmonic approximations are ill-suited to characterize these vibrations, which are observed at far-infrared frequencies and are easily excited by thermal collisions at room temperature. Here, we approached this problem with a frequency-selective anharmonic (FRESEAN) mode analysis that does not rely on harmonic approximations and successfully isolates anharmonic low-frequency vibrations from short molecular dynamics simulation trajectories. We applied FRESEAN mode analysis to simulations of alanine dipeptide, a common test system for enhanced sampling simulation protocols, and compared the performance of the isolated low-frequency vibrations to conventional user-defined collective variables (here, backbone dihedral angles) in enhanced sampling simulations. The comparison shows that enhanced sampling along anharmonic low-frequency vibrations not only reproduces known conformational dynamics but can even improve the sampling of slow transitions compared to user-defined collective variables. Notably, free energy surfaces spanned by low-frequency anharmonic vibrational modes exhibit lower barriers for conformational transitions than representations in backbone dihedral space. We thus conclude that anharmonic low-frequency vibrations provide a promising path toward highly effective and fully automated enhanced sampling simulations of conformational dynamics in biomolecules.
- The study of phenomena such as protein folding and conformational changes in molecules is a central theme in chemical physics. Molecular dynamics (MD) simulation is the primary tool for the study of transition processes in biomolecules, but it is hampered by a huge timescale gap between the processes of interest and atomic vibrations that dictate the time step size. Therefore, it is imperative to combine MD simulations with other techniques in order to quantify the transition processes taking place on large timescales. In this work, the diffusion map with Mahalanobis kernel, a meshless approach for approximating the Backward Kolmogorov Operator (BKO) in collective variables, is upgraded to incorporate standard enhanced sampling techniques, such as metadynamics. The resulting algorithm, which we call the target measure Mahalanobis diffusion map (tm-mmap), is suitable for a moderate number of collective variables in which one can approximate the diffusion tensor and free energy. Imposing appropriate boundary conditions allows use of the approximated BKO to solve for the committor function and utilization of transition path theory to find the reactive current delineating the transition channels and the transition rate. The proposed algorithm, tm-mmap, is tested on the two-dimensional Moro–Cardin two-well system with position-dependent diffusion coefficient and on alanine dipeptide in two collective variables where the committor, the reactive current, and the transition rate are compared to those computed by the finite element method (FEM). Finally, tm-mmap is applied to alanine dipeptide in four collective variables where the use of finite elements is infeasible.
- Mott metal–insulator transitions possess electronic, magnetic, and structural degrees of freedom promising next‐generation energy‐efficient electronics. A previously unknown, hierarchically ordered, and anisotropic supercrystal state is reported and its intrinsic formation characterized in‐situ during a Mott transition in a Ca2RuO4 thin film. Machine learning‐assisted X‐ray nanodiffraction together with cryogenic electron microscopy reveal multi‐scale periodic domain formation at and below the film transition temperature (TFilm ≈ 200–250 K) and a separate anisotropic spatial structure at and above TFilm. Local resistivity measurements imply an intrinsic coupling of the supercrystal orientation to the material's anisotropic conductivity. These findings add a new degree of complexity to the physical understanding of Mott transitions, opening opportunities for designing materials with tunable electronic properties.
- Base flipping is a key biophysical event involved in recognition of various ligands by ribonucleic acid (RNA) molecules. However, the mechanism of base flipping in RNA remains poorly understood, in part due to the lack of atomistic details on complex rearrangements in neighboring bases. In this work, we applied transition path sampling (TPS) methods to study base flipping in a double-stranded RNA (dsRNA) molecule that is known to interact with RNA-editing enzymes through this mechanism. We obtained an ensemble of 1000 transition trajectories to describe the base-flipping process. We used the likelihood maximization method to determine the refined reaction coordinate (RC) consisting of two collective variables (CVs), a distance and a dihedral angle between nucleotides that form stacking interactions with the flipping base. The free energy profile projected along the refined RC revealed three minima, two corresponding to the initial and final states and one for a metastable state. We suggest that the metastable state likely represents a wobbled conformation of nucleobases observed in NMR studies that is often characterized as the flipped state. The analyses of reactive trajectories further revealed that the base flipping is coupled to a global conformational change in a stem-loop of dsRNA.
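One of the related abstracts above describes solving for the committor by applying boundary conditions to an approximation of the Backward Kolmogorov Operator (BKO). The sketch below illustrates that boundary-value formulation in the simplest setting we could assume: a hypothetical one-dimensional double-well with constant diffusion, discretized by central finite differences on a grid. It is not the meshless diffusion-map method (tm-mmap) of that work, and the function name, potential, and parameters are illustrative assumptions.

```python
import numpy as np

def committor_finite_difference(n=201, a=-0.8, b=0.8, kT=0.5):
    """Solve -V'(x) q' + kT q'' = 0 on [a, b] with q(a) = 0 and q(b) = 1.

    Hypothetical toy model: V(x) = (x^2 - 1)^2, so V'(x) = 4x(x^2 - 1).
    State A is the left boundary and state B the right boundary.
    """
    x = np.linspace(a, b, n)
    h = x[1] - x[0]
    dV = 4.0 * x * (x**2 - 1.0)

    A = np.zeros((n, n))
    rhs = np.zeros(n)

    # Dirichlet boundary conditions: q = 0 on A, q = 1 on B
    A[0, 0] = 1.0
    A[-1, -1] = 1.0
    rhs[-1] = 1.0

    # Interior rows: central differences for q' and q''
    for i in range(1, n - 1):
        A[i, i - 1] = kT / h**2 + dV[i] / (2.0 * h)
        A[i, i] = -2.0 * kT / h**2
        A[i, i + 1] = kT / h**2 - dV[i] / (2.0 * h)

    q = np.linalg.solve(A, rhs)
    return x, q

if __name__ == "__main__":
    x, q = committor_finite_difference()
    print(q[len(q) // 2])  # committor at the grid midpoint (barrier top)
```

Because the assumed potential is symmetric, the committor at the barrier top should come out at one half, which gives a quick sanity check on the discretization. Grid-based solves of this kind scale poorly with the number of collective variables, which is the limitation that meshless approaches aim to avoid.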

