

Title: Computing conformational free energy differences in explicit solvent: An efficient thermodynamic cycle using an auxiliary potential and a free energy functional constructed from the end points

Many biomolecules undergo conformational changes associated with allostery or ligand binding. Observing these changes in computer simulations is difficult if their timescales are long. These calculations can be accelerated by observing the transition on an auxiliary free energy surface with a simpler Hamiltonian and connecting this free energy surface to the target free energy surface with free energy calculations. Here, we show that the free energy legs of the cycle can be replaced with energy representation (ER) density functional approximations. We compute: (1) the conformational free energy changes for alanine dipeptide transitioning from the right‐handed free energy basin to the left‐handed basin and (2) the free energy difference between the open and closed conformations of β‐cyclodextrin, a “host” molecule that serves as a model for molecular recognition in host‐guest binding. β‐cyclodextrin contains 147 atoms compared to 22 atoms for alanine dipeptide, making β‐cyclodextrin a large molecule for which to compute solvation free energies by free energy perturbation or integration methods and the largest system for which the ER method has been compared to exact free energy methods. The ER method replaced the 28 simulations to compute each coupling free energy with two endpoint simulations, reducing the computational time for the alanine dipeptide calculation by about 70% and for the β‐cyclodextrin by more than 95%. The method works even when the distribution of conformations on the auxiliary free energy surface differs substantially from that on the target free energy surface, although some degree of overlap between the two surfaces is required. © 2016 Wiley Periodicals, Inc.
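The thermodynamic cycle described in the abstract can be written out in a few lines. The following is a minimal sketch only: the function name, the argument names, and the numbers in the usage example are hypothetical, and the sign convention (each coupling free energy measured from the auxiliary surface to the target surface for a fixed conformational state) is an assumption rather than something stated in the abstract.

```python
def target_free_energy_difference(dg_aux_a_to_b, dg_couple_a, dg_couple_b):
    """Close a thermodynamic cycle for the conformational change A -> B.

    dg_aux_a_to_b : free energy change A -> B on the auxiliary surface
    dg_couple_a   : free energy of switching auxiliary -> target in state A
    dg_couple_b   : free energy of switching auxiliary -> target in state B
    All values must share the same units (for example kcal/mol).
    """
    # Path followed: target(A) -> aux(A) -> aux(B) -> target(B).
    # The first leg runs against the assumed coupling direction, hence the minus sign.
    return -dg_couple_a + dg_aux_a_to_b + dg_couple_b


# Hypothetical numbers, chosen only to exercise the function.
dg = target_free_energy_difference(
    dg_aux_a_to_b=1.5, dg_couple_a=-10.0, dg_couple_b=-9.0
)
print(f"Delta G (target, A -> B) = {dg:.2f}")
```

In this picture, the two coupling terms are the legs that the abstract says were estimated from endpoint simulations instead of a series of intermediate simulations.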

 
NSF-PAR ID:
10038943
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Journal of Computational Chemistry
Volume:
38
Issue:
15
ISSN:
0192-8651
Page Range / eLocation ID:
p. 1198-1208
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This data set for the manuscript entitled "Design of Peptides that Fold and Self-Assemble on Graphite" includes all files needed to run and analyze the simulations described in this manuscript in the molecular dynamics software NAMD, as well as the output of the simulations. The files are organized into directories corresponding to the figures of the main text and supporting information. They include molecular model structure files (NAMD psf or Amber prmtop format), force field parameter files (in CHARMM format), initial atomic coordinates (pdb format), NAMD configuration files, Colvars configuration files, NAMD log files, and NAMD output including restart files (in binary NAMD format) and trajectories in dcd format (downsampled to 10 ns per frame). Analysis is controlled by shell scripts (Bash-compatible) that call VMD Tcl scripts or Python scripts. These scripts and their output are also included.

    Version: 2.0

    Changes versus version 1.0 are the addition of the free energy of folding, adsorption, and pairing calculations (Sim_Figure-7) and shifting of the figure numbers to accommodate this addition.


    Conventions Used in These Files
    ===============================

    Structure Files
    ----------------
    - graph_*.psf or sol_*.psf (original NAMD (XPLOR?) format psf file including atom details (type, charge, mass), as well as definitions of bonds, angles, dihedrals, and impropers for each dipeptide.)

    - graph_*.pdb or sol_*.pdb (initial coordinates before equilibration)
    - repart_*.psf (same as the above psf files, but the masses of non-water hydrogen atoms have been repartitioned by VMD script repartitionMass.tcl)
    - freeTop_*.pdb (same as the above pdb files, but the carbons of the lower graphene layer have been placed at a single z value and marked for restraints in NAMD)
    - amber_*.prmtop (combined topology and parameter files for Amber force field simulations)
    - repart_amber_*.prmtop (same as the above prmtop files, but the masses of non-water hydrogen atoms have been repartitioned by ParmEd)

    Force Field Parameters
    ----------------------
    CHARMM format parameter files:
    - par_all36m_prot.prm (CHARMM36m FF for proteins)
    - par_all36_cgenff_no_nbfix.prm (CGenFF v4.4 for graphene) The NBFIX parameters are commented out since they are only needed for aromatic halogens and we use only the CG2R61 type for graphene.
    - toppar_water_ions_prot_cgenff.str (CHARMM water and ions with NBFIX parameters needed for protein and CGenFF included and others commented out)

    Template NAMD Configuration Files
    ---------------------------------
    These contain the most commonly used simulation parameters. They are called by the other NAMD configuration files (which are in the namd/ subdirectory):
    - template_min.namd (minimization)
    - template_eq.namd (NPT equilibration with lower graphene fixed)
    - template_abf.namd (for adaptive biasing force)

    Minimization
    -------------
    - namd/min_*.0.namd

    Equilibration
    -------------
    - namd/eq_*.0.namd

    Adaptive biasing force calculations
    -----------------------------------
    - namd/eabfZRest7_graph_chp1404.0.namd
    - namd/eabfZRest7_graph_chp1404.1.namd (continuation of eabfZRest7_graph_chp1404.0.namd)

    Log Files
    ---------
    For each NAMD configuration file given in the last two sections, there is a log file with the same prefix, which gives the text output of NAMD. For instance, the output of namd/eabfZRest7_graph_chp1404.0.namd is eabfZRest7_graph_chp1404.0.log.

    Simulation Output
    -----------------
    The simulation output files (which match the names of the NAMD configuration files) are in the output/ directory. Files with the extensions .coor, .vel, and .xsc are coordinates in NAMD binary format, velocities in NAMD binary format, and extended system information (including cell size) in text format, respectively. Files with the extension .dcd give the trajectory of the atomic coordinates over time (and also include system cell information). Due to storage limitations, large DCD files have been omitted or replaced with new DCD files having the prefix stride50_, which include only every 50th frame. The time between frames in these files is 50 * 50000 steps/frame * 4 fs/step = 10 ns. The system cell trajectory for the NPT runs is also included as output/eq_*.xst.
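The frame spacing quoted above for the stride50_ files can be checked with a short calculation. This is an illustrative sketch; the function and variable names are hypothetical, and only the three numbers (stride 50, 50000 steps per frame, 4 fs per step) come from the text.

```python
FS_PER_NS = 1_000_000  # 1 ns = 10^6 fs


def frame_interval_ns(stride, steps_per_frame, timestep_fs):
    """Time between consecutive frames of a strided trajectory, in ns."""
    return stride * steps_per_frame * timestep_fs / FS_PER_NS


# Values taken from the description of the stride50_ DCD files.
interval = frame_interval_ns(stride=50, steps_per_frame=50_000, timestep_fs=4)
print(interval)  # 10.0
```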

    Scripts
    -------
    Files with the .sh extension can be found throughout. These usually provide the highest level control for submission of simulations and analysis. Look to these as a guide to what is happening. If there are scripts with step1_*.sh and step2_*.sh, they are intended to be run in order, with step1_*.sh first.


    CONTENTS
    ========

    The directory contents are as follows. The directories Sim_Figure-1 and Sim_Figure-8 include README.txt files that describe the files and naming conventions used throughout this data set.

    Sim_Figure-1: Simulations of N-acetylated C-amidated amino acids (Ac-X-NHMe) at the graphite–water interface.

    Sim_Figure-2: Simulations of different peptide designs (including acyclic, disulfide cyclized, and N-to-C cyclized) at the graphite–water interface.

    Sim_Figure-3: MM-GBSA calculations of different peptide sequences for a folded conformation and 5 misfolded/unfolded conformations.

    Sim_Figure-4: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-5: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 295 K.

    Sim_Figure-5_replica: Temperature replica exchange molecular dynamics simulations for the peptide cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) with 20 replicas for temperatures from 295 to 454 K.

    Sim_Figure-6: Simulation of the peptide molecule cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) in free solution (no graphite).

    Sim_Figure-7: Free energy calculations for folding, adsorption, and pairing for the peptide CHP1404 (sequence: cyc(GTGSGTG-GPGG-GCGTGTG-SGPG)). For folding, we calculate the PMF as a function of RMSD by replica-exchange umbrella sampling (in the subdirectory Folding_CHP1404_Graphene/). We make the same calculation in solution, which required 3 separate replica-exchange umbrella sampling calculations (in the subdirectory Folding_CHP1404_Solution/). Both PMF of RMSD calculations for the scrambled peptide are in Folding_scram1404/. For adsorption, calculation of the PMF for the orientational restraints and the calculation of the PMF along z (the distance between the graphene sheet and the center of mass of the peptide) are in Adsorption_CHP1404/ and Adsorption_scram1404/. The actual calculation of the free energy is done by a shell script ("doRestraintEnergyError.sh") in the 1_free_energy/ subsubdirectory. Processing of the PMFs must be done first in the 0_pmf/ subsubdirectory. Finally, files for free energy calculations of pair formation for CHP1404 are found in the Pair/ subdirectory.

    Sim_Figure-8: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) where the peptides are far above the graphene–water interface in the initial configuration.

    Sim_Figure-9: Two replicates of a simulation of nine peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-9_scrambled: Two replicates of a simulation of nine peptide molecules with the control sequence cyc(GGTPTTGGGGGGSGGPSGTGGC) at the graphite–water interface at 370 K.

    Sim_Figure-10: Adaptive biasing for calculation of the free energy of the folded peptide as a function of the angle between its long axis and the zigzag directions of the underlying graphene sheet.
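For the temperature replica exchange entry above (Sim_Figure-5_replica: 20 replicas from 295 to 454 K), the text gives only the number of replicas and the temperature range. The sketch below shows one common way to space such a ladder, with a constant ratio between neighboring temperatures. The geometric spacing and the function name are assumptions; the temperatures actually used in the data set may differ.

```python
def geometric_ladder(t_min, t_max, n_replicas):
    """Temperatures with a constant ratio between neighbors (assumed spacing)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


# Replica count and temperature range from the Sim_Figure-5_replica description.
temps = geometric_ladder(295.0, 454.0, 20)
for i, t in enumerate(temps):
    print(f"replica {i:2d}: {t:6.1f} K")
```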

     

    This material is based upon work supported by the US National Science Foundation under grant no. DMR-1945589. A majority of the computing for this project was performed on the Beocat Research Cluster at Kansas State University, which is funded in part by NSF grants CHE-1726332, CNS-1006860, EPS-1006860, and EPS-0919443. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562, through allocation BIO200030. 
  2. Metadynamics calculations of large chemical systems with ab initio methods are computationally prohibitive due to the extensive sampling required to cover the many degrees of freedom in these systems. To address this computational bottleneck, we utilized a GPU-enhanced density functional tight binding (DFTB) approach on a massively parallelized cloud computing platform to efficiently calculate the thermodynamics and metadynamics of biochemical systems. To first validate our approach, we calculated the free-energy surfaces of alanine dipeptide and showed that our GPU-enhanced DFTB calculations qualitatively agree with computationally-intensive hybrid DFT benchmarks, whereas classical force fields give significant errors. Most importantly, we show that our GPU-accelerated DFTB calculations are faster than previous approaches by up to two orders of magnitude. To further extend our GPU-enhanced DFTB approach, we also carried out a 10 ns metadynamics simulation of remdesivir, which is out of reach for routine DFT-based metadynamics calculations. We find that the free-energy surfaces of remdesivir obtained from DFTB and classical force fields differ significantly, where the latter overestimates the internal energy contribution of high free-energy states. Taken together, our benchmark tests, analyses, and extensions to large biochemical systems highlight the use of GPU-enhanced DFTB simulations for efficiently predicting the free-energy surfaces/thermodynamics of large biochemical systems. 
  3. Abstract

    Gaussian accelerated molecular dynamics (GaMD) is a robust computational method for simultaneous unconstrained enhanced sampling and free energy calculations of biomolecules. It works by adding a harmonic boost potential to smooth the biomolecular potential energy surface and reduce energy barriers. GaMD accelerates biomolecular simulations by orders of magnitude. Without the need to set predefined reaction coordinates or collective variables, GaMD provides unconstrained enhanced sampling and is advantageous for simulating complex biological processes. The GaMD boost potential exhibits a Gaussian distribution, thereby allowing for energetic reweighting via cumulant expansion to the second order (i.e., “Gaussian approximation”). This leads to accurate reconstruction of free energy landscapes of biomolecules. Hybrid schemes with other enhanced sampling methods, such as the replica‐exchange GaMD (rex‐GaMD) and replica‐exchange umbrella sampling GaMD (GaREUS), have also been introduced, further improving sampling and free energy calculations. Recently, new “selective GaMD” algorithms including the Ligand GaMD (LiGaMD) and Peptide GaMD (Pep‐GaMD) enabled microsecond simulations to capture repetitive dissociation and binding of small‐molecule ligands and highly flexible peptides. The simulations then allowed highly efficient quantitative characterization of the ligand/peptide binding thermodynamics and kinetics. Taken together, GaMD and its innovative variants are applicable to simulate a wide variety of biomolecular dynamics, including protein folding, conformational changes and allostery, ligand binding, peptide binding, protein–protein/nucleic acid/carbohydrate interactions, and carbohydrate/nucleic acid interactions. In this review, we present principles of the GaMD algorithms and recent applications in biomolecular simulations and drug design.
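The two ingredients named in this abstract, a harmonic boost potential and reweighting by a second-order cumulant expansion, can be sketched as follows. This is an illustration under stated assumptions, not the GaMD implementation: the function names and the sample numbers are hypothetical, the boost is written in the commonly quoted form 0.5*k*(E - V)^2 applied when the potential V lies below a threshold E, and the procedure for choosing k and E is not shown.

```python
def gamd_boost(v, e_threshold, k):
    """Harmonic boost: 0.5*k*(E - V)^2 when V < E, otherwise zero."""
    if v < e_threshold:
        return 0.5 * k * (e_threshold - v) ** 2
    return 0.0


def log_reweight_factor(boosts, beta):
    """Approximate ln<exp(beta*dV)> by a cumulant expansion to second order.

    First cumulant: mean of dV. Second cumulant: variance of dV.
    """
    n = len(boosts)
    mean = sum(boosts) / n
    var = sum((b - mean) ** 2 for b in boosts) / n
    return beta * mean + 0.5 * beta ** 2 * var


# Hypothetical potential energies (arbitrary units) for frames in one histogram bin.
energies = [-5.0, -3.0, -4.0, 1.0]
boosts = [gamd_boost(v, e_threshold=0.0, k=0.2) for v in energies]
print(log_reweight_factor(boosts, beta=1.0))
```

In practice such a factor would be evaluated separately for each bin of the chosen coordinate and combined with the biased histogram to recover the free energy profile.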

    This article is categorized under:

    Structure and Mechanism > Computational Biochemistry and Biophysics

    Molecular and Statistical Mechanics > Molecular Dynamics and Monte‐Carlo Methods

    Molecular and Statistical Mechanics > Free Energy Methods

     
  4. The study of phenomena such as protein folding and conformational changes in molecules is a central theme in chemical physics. Molecular dynamics (MD) simulation is the primary tool for the study of transition processes in biomolecules, but it is hampered by a huge timescale gap between the processes of interest and atomic vibrations that dictate the time step size. Therefore, it is imperative to combine MD simulations with other techniques in order to quantify the transition processes taking place on large timescales. In this work, the diffusion map with Mahalanobis kernel, a meshless approach for approximating the Backward Kolmogorov Operator (BKO) in collective variables, is upgraded to incorporate standard enhanced sampling techniques, such as metadynamics. The resulting algorithm, which we call the target measure Mahalanobis diffusion map (tm-mmap), is suitable for a moderate number of collective variables in which one can approximate the diffusion tensor and free energy. Imposing appropriate boundary conditions allows use of the approximated BKO to solve for the committor function and utilization of transition path theory to find the reactive current delineating the transition channels and the transition rate. The proposed algorithm, tm-mmap, is tested on the two-dimensional Moro–Cardin two-well system with position-dependent diffusion coefficient and on alanine dipeptide in two collective variables where the committor, the reactive current, and the transition rate are compared to those computed by the finite element method (FEM). Finally, tm-mmap is applied to alanine dipeptide in four collective variables where the use of finite elements is infeasible.

     
  5. Abstract

    DNA polymer‐wrapped single‐walled carbon nanotubes (SWNTs) find widespread use in a variety of nanotechnology applications. Molecular dynamics (MD) simulations and experiments are used to explore the relationship between structural conformation, binding affinity, and kinetic stability for short single‐stranded oligonucleotides adsorbed on SWNTs. The conformation of 36 oligonucleotide sequences on (9,4) SWNT is computationally screened, where the polymer lengths are selected so the polymers can, to a first approximation, wrap once around the SWNT circumference. The identified conformations can be broadly classified into “rings” and “non‐rings.” Then, 2D conformational free energy landscapes for selected sequences are obtained by temperature replica exchange calculations. Propensity for “ring” conformations is driven primarily by sequence chemistry and the ability of the polymer to form compact structures. However, ring‐formation probability is found to be uncorrelated with free energy of oligonucleotide binding to SWNTs (∆Gbind). Conformational analyses of oligonucleotides, computed free energy of binding of oligonucleotides to SWNTs, and experimentally determined kinetic stability measurements show that ∆Gbind is the primary correlate for kinetic stability. The probability of the sequence to adopt a compact, ring‐like conformation is shown to play a secondary role that still contributes measurably and positively to kinetic stability.

     