
Title: “Solvent hydrogen‐bond occlusion”: A new model of polar desolvation for biomolecular energetics

Water engages in two important types of interactions near biomolecules: it forms ordered “cages” around exposed hydrophobic regions, and it participates in hydrogen bonds with surface polar groups. Both types of interaction are critical to biomolecular structure and function, but explicitly including an appropriate number of solvent molecules makes many applications computationally intractable. A number of implicit solvent models have been developed to address this problem, many of which treat these two solvation effects separately. Here, we describe a new model to capture polar solvation effects, called SHO (“solvent hydrogen‐bond occlusion”); our model aims to directly evaluate the energetic penalty associated with displacing discrete first‐shell water molecules near each solute polar group. We have incorporated SHO into the Rosetta energy function, and find that scoring protein structures with SHO provides superior performance in loop modeling, virtual screening, and protein structure prediction benchmarks. These improvements stem from the fact that SHO accurately identifies and penalizes polar groups that do not participate in hydrogen bonds, either with solvent or with other solute atoms (“unsatisfied” polar groups). We expect that in future, SHO will enable higher‐resolution predictions for a variety of molecular modeling applications. © 2017 Wiley Periodicals, Inc.

Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Journal Name:
Journal of Computational Chemistry
Page Range / eLocation ID:
p. 1321-1331
Sponsoring Org:
National Science Foundation
More Like this
  1. n→π* interactions between consecutive carbonyls stabilize the α-helix and polyproline II helix (PPII) conformations in proteins. n→π* interactions have been suggested to provide significant conformational biases to the disordered states of proteins. To understand the roles of solvation on the strength of n→π* interactions, computational investigations were conducted on a model n→π* interaction, the twisted-parallel-offset formaldehyde dimer, as a function of explicit solvation of the donor and acceptor carbonyls, using water and HF. In addition, the effects of urea, thiourea, guanidinium, and monovalent cations on n→π* interaction strength were examined. Solvation of the acceptor carbonyl significantly strengthens the n→π* interaction, while solvation of the donor carbonyl only modestly weakens the n→π* interaction. The n→π* interaction strength was maximized with two solvent molecules on the acceptor carbonyl. Urea stabilized the n→π* interaction via simultaneous engagement of both oxygen lone pairs on the acceptor carbonyl. Solvent effects were further investigated in the model peptides Ac-Pro-NMe2, Ac-Ala-NMe2, and Ac-Pro2-NMe2. Solvent effects in peptides were similar to those in the formaldehyde dimer, with solvation of the acceptor carbonyl increasing n→π* interaction strength and resulting in more compact conformations, in both the proline endo and exo ring puckers, as well as a reduction in the energy difference between these ring puckers. Carbonyl solvation leads to an energetic preference for PPII over both the α-helix and β/extended conformations, consistent with experimental data that protic solvents and protein denaturants both promote PPII. Solvation of the acceptor carbonyl weakens the intraresidue C5 hydrogen bond that stabilizes the β conformation.
  2. The 3D reference interaction site model (3D‐RISM) of molecular solvation is a powerful tool for computing the equilibrium thermodynamics and density distributions of solvents, such as water and co‐ions, around solute molecules. However, 3D‐RISM solutions can be expensive to calculate, especially for proteins and other large molecules where calculating the potential energy between solute and solvent requires more than half the computation time. To address this problem, we have developed and implemented treecode summation for long‐range interactions and analytically corrected cut‐offs for short‐range interactions to accelerate the potential energy and long‐range asymptotics calculations in non‐periodic 3D‐RISM in the AmberTools molecular modeling suite. For the largest single protein considered in this work, tubulin, the total computation time was reduced by a factor of 4. In addition, parallel calculations with these new methods scale almost linearly, and the iterative solver remains the largest impediment to parallel scaling. To demonstrate the utility of our approach for large systems, we used 3D‐RISM to calculate the solvation thermodynamics and density distribution of a 7‐ring microtubule, consisting of 910 tubulin dimers and over 1.2 million atoms.
  3. Non-toxic, chemically inert organic polymers such as polyethylene glycol (PEG) and polyoxymethylene (POM) have versatile applications in basic research, industry and pharmacy. In this work, we aim to characterize the hydration structure of PEG and POM oligomers by exploring how the solute disturbs the water structure compared to the bulk solvent and how the solute chain interacts with the solvent. We explore the effect of (i) the C–C–O (PEG) versus C–O (POM) constitution of the chain and (ii) chain length. To this end, MD simulations followed by clustering and topological analysis of the hydration network, as well as quantum mechanical calculations of atomic charges, are used. We show that the hydration varies with chain conformation and length. The degree of folding of the chain impacts its degree of solvation, which is measurable by different parameters, for example the number of water molecules in the first solvation shell and the solvent accessible surface. Atomic charges calculated on the oligomers in the gas phase are stable across conformations and chain lengths and seem not to determine solvation. Hydration, however, induces charge transfer from the solute molecule to the solvent, which depends on the degree of hydration.
  4. This data set for the manuscript entitled "Design of Peptides that Fold and Self-Assemble on Graphite" includes all files needed to run and analyze the simulations described in this manuscript in the molecular dynamics software NAMD, as well as the output of the simulations. The files are organized into directories corresponding to the figures of the main text and supporting information. They include molecular model structure files (NAMD psf or Amber prmtop format), force field parameter files (in CHARMM format), initial atomic coordinates (pdb format), NAMD configuration files, Colvars configuration files, NAMD log files, and NAMD output including restart files (in binary NAMD format) and trajectories in dcd format (downsampled to 10 ns per frame). Analysis is controlled by shell scripts (Bash-compatible) that call VMD Tcl scripts or Python scripts. These scripts and their output are also included.

    Version: 2.0

    Changes versus version 1.0 are the addition of the free energy of folding, adsorption, and pairing calculations (Sim_Figure-7) and shifting of the figure numbers to accommodate this addition.

    Conventions Used in These Files

    Structure Files
    - graph_*.psf or sol_*.psf (original NAMD (XPLOR?) format psf file including atom details (type, charge, mass), as well as definitions of bonds, angles, dihedrals, and impropers for each dipeptide.)

    - graph_*.pdb or sol_*.pdb (initial coordinates before equilibration)
    - repart_*.psf (same as the above psf files, but the masses of non-water hydrogen atoms have been repartitioned by VMD script repartitionMass.tcl)
    - freeTop_*.pdb (same as the above pdb files, but the carbons of the lower graphene layer have been placed at a single z value and marked for restraints in NAMD)
    - amber_*.prmtop (combined topology and parameter files for Amber force field simulations)
    - repart_amber_*.prmtop (same as the above prmtop files, but the masses of non-water hydrogen atoms have been repartitioned by ParmEd)

    Force Field Parameters
    CHARMM format parameter files:
    - par_all36m_prot.prm (CHARMM36m FF for proteins)
    - par_all36_cgenff_no_nbfix.prm (CGenFF v4.4 for graphene) The NBFIX parameters are commented out since they are only needed for aromatic halogens and we use only the CG2R61 type for graphene.
    - toppar_water_ions_prot_cgenff.str (CHARMM water and ions with NBFIX parameters needed for protein and CGenFF included and others commented out)

    Template NAMD Configuration Files
    These contain the most commonly used simulation parameters. They are called by the other NAMD configuration files (which are in the namd/ subdirectory):
    - template_min.namd (minimization)
    - template_eq.namd (NPT equilibration with lower graphene fixed)
    - template_abf.namd (for adaptive biasing force)

    - namd/min_*.0.namd

    - namd/eq_*.0.namd

    Adaptive biasing force calculations
    - namd/eabfZRest7_graph_chp1404.0.namd
    - namd/eabfZRest7_graph_chp1404.1.namd (continuation of eabfZRest7_graph_chp1404.0.namd)

    Log Files
    For each NAMD configuration file given in the last two sections, there is a log file with the same prefix, which gives the text output of NAMD. For instance, the output of namd/eabfZRest7_graph_chp1404.0.namd is eabfZRest7_graph_chp1404.0.log.
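The naming convention above can be sketched as a small helper. This is only an illustration: the function name `log_name_for` is hypothetical and not part of the data set, and the sketch returns the bare file name because the text does not state which directory the log files are stored in.

```python
from pathlib import PurePosixPath


def log_name_for(config_path: str) -> str:
    """Return the log file name that shares a prefix with a NAMD config file.

    Hypothetical helper: it only encodes the "same prefix, .log extension"
    rule described in the text, and it returns a bare file name because the
    location of the log files is not specified.
    """
    config = PurePosixPath(config_path)
    if config.suffix != ".namd":
        raise ValueError(f"not a NAMD configuration file: {config_path}")
    # Swap the final ".namd" for ".log" and drop the directory component.
    return config.with_suffix(".log").name


print(log_name_for("namd/eabfZRest7_graph_chp1404.0.namd"))
```

For the example given in the text, this prints `eabfZRest7_graph_chp1404.0.log`.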

    Simulation Output
    The simulation output files (which match the names of the NAMD configuration files) are in the output/ directory. Files with the extensions .coor, .vel, and .xsc are coordinates in NAMD binary format, velocities in NAMD binary format, and extended system information (including cell size) in text format, respectively. Files with the extension .dcd give the trajectory of the atomic coordinates over time (and also include system cell information). Due to storage limitations, large DCD files have been omitted or replaced with new DCD files that have the prefix stride50_ and include only every 50th frame. The time between frames in these files is 50 * 50000 steps/frame * 4 fs/step = 10 ns. The system cell trajectory for the NPT runs is also included as output/eq_*.xst.
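The frame-interval arithmetic for the stride50_ files can be checked with a short calculation. The function and constant names here are illustrative assumptions; only the three input values (a stride of 50, 50000 steps per frame, and 4 fs per step) come from the text.

```python
FS_PER_NS = 1_000_000  # 1 ns = 10^6 fs


def frame_interval_ns(stride: int, steps_per_frame: int, timestep_fs: float) -> float:
    """Time between consecutive frames of a downsampled trajectory, in ns."""
    # Each retained frame is separated by `stride` original frames, and each
    # original frame spans `steps_per_frame` integration steps.
    return stride * steps_per_frame * timestep_fs / FS_PER_NS


# Values quoted in the text for the stride50_ DCD files.
print(frame_interval_ns(stride=50, steps_per_frame=50_000, timestep_fs=4))
```

With these inputs the result is 10.0 ns, matching the value stated above.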

    Files with the .sh extension can be found throughout. These usually provide the highest level control for submission of simulations and analysis. Look to these as a guide to what is happening. If there are scripts with step1_*.sh and step2_*.sh, they are intended to be run in order, with step1_*.sh first.


    The directory contents are as follows. The directories Sim_Figure-1 and Sim_Figure-8 include README.txt files that describe the files and naming conventions used throughout this data set.

    Sim_Figure-1: Simulations of N-acetylated C-amidated amino acids (Ac-X-NHMe) at the graphite–water interface.

    Sim_Figure-2: Simulations of different peptide designs (including acyclic, disulfide cyclized, and N-to-C cyclized) at the graphite–water interface.

    Sim_Figure-3: MM-GBSA calculations of different peptide sequences for a folded conformation and 5 misfolded/unfolded conformations.

    Sim_Figure-4: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-5: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 295 K.

    Sim_Figure-5_replica: Temperature replica exchange molecular dynamics simulations for the peptide cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) with 20 replicas for temperatures from 295 to 454 K.

    Sim_Figure-6: Simulation of the peptide molecule cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) in free solution (no graphite).

    Sim_Figure-7: Free energy calculations for folding, adsorption, and pairing for the peptide CHP1404 (sequence: cyc(GTGSGTG-GPGG-GCGTGTG-SGPG)). For folding, we calculate the PMF as a function of RMSD by replica-exchange umbrella sampling (in the subdirectory Folding_CHP1404_Graphene/). We make the same calculation in solution, which required 3 separate replica-exchange umbrella sampling calculations (in the subdirectory Folding_CHP1404_Solution/). Both PMF of RMSD calculations for the scrambled peptide are in Folding_scram1404/. For adsorption, the calculation of the PMF for the orientational restraints and the calculation of the PMF along z (the distance between the graphene sheet and the center of mass of the peptide) are in Adsorption_CHP1404/ and Adsorption_scram1404/. The actual calculation of the free energy is done by a shell script ("") in the 1_free_energy/ subsubdirectory. Processing of the PMFs must be done first in the 0_pmf/ subsubdirectory. Finally, files for free energy calculations of pair formation for CHP1404 are found in the Pair/ subdirectory.

    Sim_Figure-8: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) where the peptides are far above the graphene–water interface in the initial configuration.

    Sim_Figure-9: Two replicates of a simulation of nine peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-9_scrambled: Two replicates of a simulation of nine peptide molecules with the control sequence cyc(GGTPTTGGGGGGSGGPSGTGGC) at the graphite–water interface at 370 K.

    Sim_Figure-10: Adaptive biasing for calculation of the free energy of the folded peptide as a function of the angle between its long axis and the zigzag directions of the underlying graphene sheet.


    This material is based upon work supported by the US National Science Foundation under grant no. DMR-1945589. A majority of the computing for this project was performed on the Beocat Research Cluster at Kansas State University, which is funded in part by NSF grants CHE-1726332, CNS-1006860, EPS-1006860, and EPS-0919443. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562, through allocation BIO200030. 
  5. Classical molecular dynamics simulations of the hydration thermodynamics, structure, and dynamics of water in hydration shells of charged buckminsterfullerenes are presented in this study. Charging of fullerenes leads to a structural transition in the hydration shell, accompanied by creation of a significant population of dangling O–H bonds pointing toward the solute. In contrast to the well accepted structure–function paradigm, this interfacial structural transition causes nearly no effect on either the dynamics of hydration water or on the solvation thermodynamics. Linear response to the solute charge is maintained despite significant structural changes in the hydration shell, and solvation thermodynamic potentials are nearly insensitive to the altering structure. Only solvation heat capacities, which are higher thermodynamic derivatives of the solvation free energy, indicate some sensitivity to the local hydration structure. We have separated the solvation thermodynamic potentials into direct solute–solvent interactions and restructuring of the hydration shell and analyzed the relative contributions of electrostatic and nonpolar interactions to the solvation thermodynamics. 