
Title: Automatic mutual information noise omission (AMINO): generating order parameters for molecular systems
Molecular dynamics (MD) simulations generate valuable all-atom resolution trajectories of complex systems, but analyzing this high-dimensional data as well as reaching practical timescales, even with powerful supercomputers, remain open problems. As such, many specialized sampling and reaction coordinate construction methods exist that alleviate these problems. However, these methods typically don't work directly on all atomic coordinates, and still require previous knowledge of the important distinguishing features of the system, known as order parameters (OPs). Here we present AMINO, an automated method that generates such OPs by screening through a very large dictionary of OPs, such as all heavy atom contacts in a biomolecule. AMINO uses ideas from information theory to learn OPs that can then serve as an input for designing a reaction coordinate which can then be used in many enhanced sampling methods. Here we outline its key theoretical underpinnings, and apply it to systems of increasing complexity. Our applications include a problem of tremendous pharmaceutical and engineering relevance, namely, calculating the binding affinity of a protein–ligand system when all that is known is the structure of the bound system. Our calculations are performed in a human-free fashion, obtaining very accurate results compared to long unbiased MD simulations on the Anton supercomputer, but in orders of magnitude less computer time. We thus expect AMINO to be useful for the calculation of thermodynamics and kinetics in the study of diverse molecular systems.
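As a rough illustration of what screening a dictionary of OPs with information theory could look like, the sketch below scores pairs of OP time series by a mutual-information-based distance and then greedily keeps OPs that are far from one another. It is not the AMINO implementation: the distance 1 - I(X;Y)/H(X,Y), the histogram estimator, the bin count, the greedy farthest-point selection, and the names `mi_distance` and `select_ops` are all assumptions made for this example.

```python
import numpy as np


def mi_distance(x, y, bins=20):
    """Estimate 1 - I(X;Y)/H(X,Y) from a 2D histogram of two time series.

    Returns about 0 for redundant OPs and about 1 for independent ones.
    """
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = counts / counts.sum()
    p_x = p_xy.sum(axis=1)  # marginal distribution of x
    p_y = p_xy.sum(axis=0)  # marginal distribution of y

    def entropy(p):
        p = p[p > 0]  # empty bins contribute nothing
        return -np.sum(p * np.log(p))

    h_xy = entropy(p_xy.ravel())
    if h_xy == 0.0:
        return 0.0  # both series constant: treat as fully redundant
    mutual_info = entropy(p_x) + entropy(p_y) - h_xy
    return 1.0 - mutual_info / h_xy


def select_ops(trajectory, n_select, bins=20):
    """Greedily pick mutually dissimilar OPs (columns of `trajectory`).

    `trajectory` has shape (n_frames, n_ops). Selection starts from
    column 0 (an arbitrary choice) and repeatedly adds the OP whose
    smallest distance to the already chosen OPs is largest.
    """
    n_ops = trajectory.shape[1]
    dist = np.zeros((n_ops, n_ops))
    for i in range(n_ops):
        for j in range(i + 1, n_ops):
            d = mi_distance(trajectory[:, i], trajectory[:, j], bins)
            dist[i, j] = dist[j, i] = d

    chosen = [0]
    while len(chosen) < min(n_select, n_ops):
        min_dist = dist[:, chosen].min(axis=1)
        min_dist[chosen] = -1.0  # never re-pick an already chosen OP
        chosen.append(int(np.argmax(min_dist)))
    return chosen, dist
```

In this toy setting, a trajectory whose first two columns are identical and whose third is independent noise would yield a near-zero distance between the first two columns, so a request for two OPs skips the duplicate. Histogram estimates of mutual information are biased for short trajectories, so the bin count would need tuning on real data.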
Award ID(s): 1806833, 1632976
Journal Name: Molecular Systems Design & Engineering
Page Range / eLocation ID: 339 to 348
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Computational methodologies are increasingly addressing modeling of the whole cell at the molecular level. Proteins and their interactions are the key component of cellular processes. Techniques for modeling protein interactions, thus far, have included protein docking and molecular simulation. The latter approaches account for the dynamics of the interactions but are relatively slow, if carried out at all-atom resolution, or are significantly coarse grained. Protein docking algorithms are far more efficient in sampling spatial coordinates. However, they do not account for the kinetics of the association (i.e., they do not involve the time coordinate). Our proof-of-concept study bridges the two modeling approaches, developing an approach that can reach unprecedented simulation timescales at all-atom resolution. The global intermolecular energy landscape of a large system of proteins was mapped by the pairwise fast Fourier transform docking and sampled in space and time by Monte Carlo simulations. The simulation protocol was parametrized on existing data and validated on a number of observations from experiments and molecular dynamics simulations. The simulation protocol performed consistently across very different systems of proteins at different protein concentrations. It recapitulated data on the previously observed protein diffusion rates and aggregation. The speed of calculation allows reaching second-long trajectories of protein systems that approach the size of the cells, at atomic resolution.

  2. Immobile four-way junctions (4WJs) are core structural motifs employed in the design of programmed DNA assemblies. Understanding the impact of sequence on their equilibrium structure and flexibility is important for informing the design of complex DNA architectures. While core junction sequence is known to impact the preferences for the two possible isomeric states that junctions reside in, previous investigations have not quantified these preferences based on molecular-level interactions. Here, we use all-atom molecular dynamics simulations to investigate base-pair level structure and dynamics of four-way junctions, using the canonical Seeman J1 junction as a reference. Comparison of J1 with equivalent single-crossover topologies and isolated nicked duplexes reveals the conformational impact of the double-crossover motif. We additionally contrast J1 with a second junction core sequence termed J24, with equal thermodynamic preference for each isomeric configuration. Analyses of the base-pair degrees of freedom for each system, free energy calculations, and reduced-coordinate sampling of the 4WJ isomers reveal the significant impact base sequence has on local structure, isomer bias, and global junction dynamics.

  3. Small integration time steps limit molecular dynamics (MD) simulations to millisecond time scales. Markov state models (MSMs) and equation-free approaches learn low-dimensional kinetic models from MD simulation data by performing configurational or dynamical coarse-graining of the state space. The learned kinetic models enable the efficient generation of dynamical trajectories over vastly longer time scales than are accessible by MD, but the discretization of configurational space and/or absence of a means to reconstruct molecular configurations precludes the generation of continuous all-atom molecular trajectories. We propose latent space simulators (LSS) to learn kinetic models for continuous all-atom simulation trajectories by training three deep learning networks to (i) learn the slow collective variables of the molecular system, (ii) propagate the system dynamics within this slow latent space, and (iii) generatively reconstruct molecular configurations. We demonstrate the approach in an application to Trp-cage miniprotein to produce novel ultra-long synthetic folding trajectories that accurately reproduce all-atom molecular structure, thermodynamics, and kinetics at six orders of magnitude lower cost than MD. The dramatically lower cost of trajectory generation enables greatly improved sampling and greatly reduced statistical uncertainties in estimated thermodynamic averages and kinetic rates. 
  4. The morphology of semiconducting polymer thin films is known to have a profound effect on their opto-electronic properties. Although considerable efforts have been made to control and understand the processes which influence the structures of these systems, it remains largely unclear what physical factors determine the arrangement of polymer chains in spin-cast films. Here, we investigate the role that the liquid–vapor interfaces in chlorobenzene solutions of poly(3-hexylthiophene) [P3HT] play in the conformational geometries adopted by the polymers. Using all-atom molecular dynamics (MD), and supported by toy-model simulations, we demonstrate that, with increasing concentration, P3HT oligomers in solution exhibit a strong propensity for the liquid–vapor interface. Due to the differential solubility of the backbone and side chains of the oligomers, in the vicinity of this interface the hexyl chains and the thiophene rings have a clear orientational preference with respect to the liquid surface. At high concentrations, we additionally establish a substantial degree of inter-oligomer alignment and thiophene ring stacking near the interface. Our results broadly concur with the limited existing experimental evidence and we suggest that the interfacial structure can act as a template for film structure. We argue that the differences in solvent affinity of the side chain and backbone moieties are the driving force for the anisotropic orientations of the polymers near the interface. This finer-grained description contrasts with the usual monolithic characterization of polymer units. Since this phenomenon can be controlled by concurrent chemical design and the choice of solvents, this work establishes a fabrication principle which may be useful to develop more highly functional polymer films.
  5. Protein-peptide interactions play essential roles in many cellular processes and their structural characterization is the major focus of current experimental and theoretical research. Two decades ago, it was proposed to employ steered molecular dynamics (SMD) to assess the strength of protein-peptide interactions. The idea behind using SMD simulations is that mechanical stability can serve as a promising and efficient alternative to the computationally demanding estimation of binding affinity. However, mechanical stability, defined as a peak in the force-extension profile, depends on the choice of the pulling direction. Here we propose an uncommon choice of pulling direction, along the resultant dipole moment (RDM) vector, which has not been explored in SMD simulations so far. Using explicit-solvent all-atom MD simulations, we apply the SMD technique to probe the mechanical resistance of a ligand-receptor system pulled along two different vectors. The novel pulling direction, in which the ligand unbinds along the RDM vector, results in stronger forces than the commonly used unbinding along the center-of-mass vector. Our observation that the RDM is one of the factors influencing the mechanical stability of a protein-peptide complex can be used to improve the ranking of binding affinities by using mechanical stability as an effective scoring function.