Title: Analysis of the conformational properties of amine ligands at the gold/water interface with QM, MM and QM/MM simulations
We describe a strategy of integrating quantum mechanical (QM), hybrid quantum mechanical/molecular mechanical (QM/MM) and MM simulations to analyze the physical properties of a solid/water interface. This protocol involves using a correlated ab initio (CCSD(T)) method to first calibrate Density Functional Theory (DFT) as the QM approach, which is then used in QM/MM simulations to compute relevant free energy quantities at the solid/water interface using a mean-field approximation of Yang et al. that decouples QM and MM thermal fluctuations; gas-phase QM/MM and periodic DFT calculations are used to determine the proper QM size in the QM/MM simulations. Finally, the QM/MM free energy results are compared with those obtained from MM simulations to directly calibrate the force field model for the solid/water interface. This protocol is illustrated by examining the orientations of an alkyl amine ligand at the gold/water interface, since the ligand conformation is expected to impact the chemical properties (e.g., charge) of the solid surface. DFT/MM and MM simulations using the INTERFACE force field lead to consistent results, suggesting that the effective gold/ligand interactions can be adequately described by a van der Waals model, while electrostatic and induction effects are largely quenched by solvation. The observed differences among periodic DFT, QM/MM and MM simulations, nevertheless, suggest that explicitly including electronic polarization and potentially charge transfer in the MM model can be important to the quantitative accuracy. The strategy of integrating multiple computational methods to cross-validate each other for complex interfaces is applicable to many problems that involve both inorganic/metallic and organic/biomolecular components, such as functionalized nanoparticles.
Award ID(s):
1503408
NSF-PAR ID:
10060329
Journal Name:
Physical Chemistry Chemical Physics
Volume:
20
Issue:
5
ISSN:
1463-9076
Page Range / eLocation ID:
3349 to 3362
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Methods which accurately predict protein–ligand binding strengths are critical for drug discovery. In the last two decades, advances in chemical modelling have enabled steadily accelerating progress in the discovery and optimization of structure-based drug design. Most computational methods currently used in this context are based on molecular mechanics force fields that often have deficiencies in describing the quantum mechanical (QM) aspects of molecular binding. In this study, we show the competitiveness of our QM-based Molecules-in-Molecules (MIM) fragmentation method for characterizing binding energy trends for seven different datasets of protein–ligand complexes. By using molecular fragmentation, the MIM method allows for accelerated QM calculations. We demonstrate that for classes of structurally similar ligands bound to a common receptor, MIM provides excellent correlation to experiment, surpassing the more popular Molecular Mechanics Poisson-Boltzmann Surface Area (MM/PBSA) and Molecular Mechanics Generalized Born Surface Area (MM/GBSA) methods. The MIM method offers a relatively simple, well-defined protocol by which binding trends can be ascertained at the QM level and is suggested as a promising option for lead optimization in structure-based drug design.
  2. Resistance to carbapenem β-lactams presents major clinical and economic challenges for the treatment of pathogen infections. The fast hydrolysis of carbapenems by carbapenemase-producing bacterial strains enables the effective deactivation of carbapenem antibiotics. In this study, we aim to unravel the structural features that distinguish the notable deacylation activity of carbapenemases. The deacylation reactions between imipenem (IPM) and the KPC-2 class A serine-based β-lactamases (ASβLs) are modeled with combined quantum mechanical/molecular mechanical (QM/MM) minimum energy pathway (MEP) calculations and interpretable machine-learning (ML) methods. We first applied a dual-level computational protocol to achieve fast sampling of QM/MM MEPs. A tree-based ensemble ML model was employed to learn the MEP activation barriers from the conformational features of the KPC-2/IPM active site. The barrier-predicting model was then unboxed using the Shapley additive explanation (SHAP) importance attribution methods to derive mechanistic insights, which were also verified by additional QM/MM wavefunction analysis. Essentially, we show that potential hydrogen bonding interactions of the general base and the tautomerization states of the carbapenem pyrroline ring could concertedly regulate the activation barrier of KPC-2/IPM deacylation. Nonetheless, we demonstrate the efficacy of interpretable ML to assist the analysis of QM/MM simulation data for robust extraction of human-interpretable mechanistic insights.
  3. This dataset consists of 800 coordinate files (in the CHARMM psf/cor format) for the QM/MM minimum energy pathways of the acylation reactions between a Class A beta-lactamase (Toho-1) and two beta-lactam antibiotic molecules (ampicillin and cefalexin).

    These files are:

    • toho_amp.r1-ae.zip: The R1-AE acylation pathways for Toho-1/Ampicillin (200 pathways);
    • toho_amp.r2-ae.zip: The R2-AE acylation pathways for Toho-1/Ampicillin (200 pathways);
    • toho_cex.r1-ae.zip: The R1-AE acylation pathways for Toho-1/Cefalexin (200 pathways);
    • toho_cex.r2-ae.zip: The R2-AE acylation pathways for Toho-1/Cefalexin (200 pathways);
    • energies.zip: the replica energies at B3LYP-D3/6-31+G**/C36 level;
    • chelpgs.zip: the ChElPG charges of all reactant replicas at B3LYP-D3/6-31+G**/C36 level;
    • farrys.zip: the featurized NumPy arrays for model training;
    • peephole.zip: an example file showing what the optimized MEPs look like;
    • dftb3_benchmark.zip: the reference calculations that justify the use of DFTB3/3OB-f/C36 in MEP optimizations; the reference level of theory is B3LYP-D3/6-31G**/C36.

    In the R1-AE pathways, acylation uses Glu166 as the general base; in the R2-AE pathways, it uses Lys73 and Glu166 as the concerted base.

    All QM/MM pathways are optimized at the DFTB3/3OB-f/CHARMM36 level of theory. 

    Z. Song et al., Mechanistic Insights into Enzyme Catalysis from Explaining Machine-Learned Quantum Mechanical and Molecular Mechanical Minimum Energy Pathways. ACS Physical Chemistry Au, in press. DOI: 10.1021/acsphyschemau.2c00005

     
  4. We present a computational protocol for the fast and automated screening of excited-state hybrid quantum mechanics/molecular mechanics (QM/MM) models of rhodopsins to be used as fluorescent probes based on the automatic rhodopsin modeling protocol (a-ARM). This “a-ARM fluorescence screening protocol” is implemented through a general Python-based driver, PyARM, that is also proposed here. The implementation and performance of the protocol are benchmarked using different sets of rhodopsin variants whose absorption and, more relevantly, emission spectra have been experimentally assessed. We show that, despite important limitations that make it unsafe to use as a black-box tool, the protocol reproduces the observed trends in fluorescence and is capable of selecting novel potentially fluorescent rhodopsins. We also show that the protocol can be used in mechanistic investigations to discern fluorescence enhancement effects associated with a near degeneracy of the S1/S2 states or, alternatively, with a barrier generated via coupling of the S0/S1 wave functions.
  5. This data set for the manuscript entitled "Design of Peptides that Fold and Self-Assemble on Graphite" includes all files needed to run and analyze the simulations described in this manuscript in the molecular dynamics software NAMD, as well as the output of the simulations. The files are organized into directories corresponding to the figures of the main text and supporting information. They include molecular model structure files (NAMD psf or Amber prmtop format), force field parameter files (in CHARMM format), initial atomic coordinates (pdb format), NAMD configuration files, Colvars configuration files, NAMD log files, and NAMD output including restart files (in binary NAMD format) and trajectories in dcd format (downsampled to 10 ns per frame). Analysis is controlled by shell scripts (Bash-compatible) that call VMD Tcl scripts or Python scripts. These scripts and their output are also included.

    Version: 2.0

    Changes versus version 1.0 are the addition of the free energy of folding, adsorption, and pairing calculations (Sim_Figure-7) and shifting of the figure numbers to accommodate this addition.


    Conventions Used in These Files
    ===============================

    Structure Files
    ----------------
    - graph_*.psf or sol_*.psf (original NAMD (XPLOR?) format psf file including atom details (type, charge, mass), as well as definitions of bonds, angles, dihedrals, and impropers for each dipeptide.)

    - graph_*.pdb or sol_*.pdb (initial coordinates before equilibration)
    - repart_*.psf (same as the above psf files, but the masses of non-water hydrogen atoms have been repartitioned by VMD script repartitionMass.tcl)
    - freeTop_*.pdb (same as the above pdb files, but the carbons of the lower graphene layer have been placed at a single z value and marked for restraints in NAMD)
    - amber_*.prmtop (combined topology and parameter files for Amber force field simulations)
    - repart_amber_*.prmtop (same as the above prmtop files, but the masses of non-water hydrogen atoms have been repartitioned by ParmEd)

    Force Field Parameters
    ----------------------
    CHARMM format parameter files:
    - par_all36m_prot.prm (CHARMM36m FF for proteins)
    - par_all36_cgenff_no_nbfix.prm (CGenFF v4.4 for graphene) The NBFIX parameters are commented out since they are only needed for aromatic halogens and we use only the CG2R61 type for graphene.
    - toppar_water_ions_prot_cgenff.str (CHARMM water and ions with NBFIX parameters needed for protein and CGenFF included and others commented out)

    Template NAMD Configuration Files
    ---------------------------------
    These contain the most commonly used simulation parameters. They are called by the other NAMD configuration files (which are in the namd/ subdirectory):
    - template_min.namd (minimization)
    - template_eq.namd (NPT equilibration with lower graphene fixed)
    - template_abf.namd (for adaptive biasing force)

    Minimization
    -------------
    - namd/min_*.0.namd

    Equilibration
    -------------
    - namd/eq_*.0.namd

    Adaptive biasing force calculations
    -----------------------------------
    - namd/eabfZRest7_graph_chp1404.0.namd
    - namd/eabfZRest7_graph_chp1404.1.namd (continuation of eabfZRest7_graph_chp1404.0.namd)

    Log Files
    ---------
    For each NAMD configuration file given in the last two sections, there is a log file with the same prefix, which gives the text output of NAMD. For instance, the output of namd/eabfZRest7_graph_chp1404.0.namd is eabfZRest7_graph_chp1404.0.log.

    Simulation Output
    -----------------
    The simulation output files (which match the names of the NAMD configuration files) are in the output/ directory. Files with the extensions .coor, .vel, and .xsc are coordinates in NAMD binary format, velocities in NAMD binary format, and extended system information (including cell size) in text format. Files with the extension .dcd give the trajectory of the atomic coordinates over time (and also include system cell information). Due to storage limitations, large DCD files have been omitted or replaced with new DCD files having the prefix stride50_ that include only every 50th frame. The time between frames in these files is 50 * 50000 steps/frame * 4 fs/step = 10 ns. The system cell trajectory for the NPT runs is also included as output/eq_*.xst.
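
The frame spacing quoted above can be checked with a short calculation. This is only a minimal sketch: the function and variable names are illustrative and not part of the data set, while the numbers (a stride of 50, 50000 steps per saved frame, and a 4 fs timestep) are taken from the text.

```python
# Values taken from the description above; all names are illustrative only.
STRIDE = 50                # every 50th frame of the original DCD is kept
STEPS_PER_FRAME = 50_000   # MD steps between frames in the original DCD
TIMESTEP_FS = 4            # integration timestep in femtoseconds


def frame_spacing_ns(stride, steps_per_frame, timestep_fs):
    """Return the time between frames of the downsampled trajectory, in ns."""
    total_fs = stride * steps_per_frame * timestep_fs
    return total_fs / 1_000_000  # 1 ns = 1e6 fs


spacing_ns = frame_spacing_ns(STRIDE, STEPS_PER_FRAME, TIMESTEP_FS)
print(f"{spacing_ns} ns between frames")
```

The same helper could be reused for a different stride or timestep if the trajectories were downsampled differently.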

    Scripts
    -------
    Files with the .sh extension can be found throughout. These usually provide the highest level control for submission of simulations and analysis. Look to these as a guide to what is happening. If there are scripts with step1_*.sh and step2_*.sh, they are intended to be run in order, with step1_*.sh first.
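
As an illustration of the ordering convention for step1_*.sh and step2_*.sh, the sketch below sorts script names by their step number. It is not part of the data set: the helper name and the example file names are hypothetical, and it only assumes that scripts follow the stepN_*.sh pattern described above.

```python
import re


def order_step_scripts(names):
    """Sort script names of the form stepN_*.sh by their step number N.

    Names that do not match the pattern are ignored.
    """
    pattern = re.compile(r"^step(\d+)_.*\.sh$")
    numbered = []
    for name in names:
        match = pattern.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    # Sorting on the integer step number keeps step10 after step2.
    return [name for _, name in sorted(numbered)]


# Hypothetical file names, for illustration only.
example = ["step2_analyze.sh", "run_all.sh", "step1_prepare.sh"]
print(order_step_scripts(example))
```

Sorting on the parsed integer rather than the raw string avoids the usual pitfall where a plain alphabetical sort would place a step10 script before step2.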


    CONTENTS
    ========

    The directory contents are as follows. The directories Sim_Figure-1 and Sim_Figure-8 include README.txt files that describe the files and naming conventions used throughout this data set.

    Sim_Figure-1: Simulations of N-acetylated C-amidated amino acids (Ac-X-NHMe) at the graphite–water interface.

    Sim_Figure-2: Simulations of different peptide designs (including acyclic, disulfide cyclized, and N-to-C cyclized) at the graphite–water interface.

    Sim_Figure-3: MM-GBSA calculations of different peptide sequences for a folded conformation and 5 misfolded/unfolded conformations.

    Sim_Figure-4: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-5: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 295 K.

    Sim_Figure-5_replica: Temperature replica exchange molecular dynamics simulations for the peptide cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) with 20 replicas for temperatures from 295 to 454 K.

    Sim_Figure-6: Simulation of the peptide molecule cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) in free solution (no graphite).

    Sim_Figure-7: Free energy calculations for folding, adsorption, and pairing for the peptide CHP1404 (sequence: cyc(GTGSGTG-GPGG-GCGTGTG-SGPG)). For folding, we calculate the PMF as a function of RMSD by replica-exchange umbrella sampling (in the subdirectory Folding_CHP1404_Graphene/). We make the same calculation in solution, which required 3 separate replica-exchange umbrella sampling calculations (in the subdirectory Folding_CHP1404_Solution/). Both PMF of RMSD calculations for the scrambled peptide are in Folding_scram1404/. For adsorption, calculation of the PMF for the orientational restraints and the calculation of the PMF along z (the distance between the graphene sheet and the center of mass of the peptide) are in Adsorption_CHP1404/ and Adsorption_scram1404/. The actual calculation of the free energy is done by a shell script ("doRestraintEnergyError.sh") in the 1_free_energy/ subsubdirectory. Processing of the PMFs must be done first in the 0_pmf/ subsubdirectory. Finally, files for free energy calculations of pair formation for CHP1404 are found in the Pair/ subdirectory.

    Sim_Figure-8: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) where the peptides are far above the graphene–water interface in the initial configuration.

    Sim_Figure-9: Two replicates of a simulation of nine peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-9_scrambled: Two replicates of a simulation of nine peptide molecules with the control sequence cyc(GGTPTTGGGGGGSGGPSGTGGC) at the graphite–water interface at 370 K.

    Sim_Figure-10: Adaptive biasing for calculation of the free energy of the folded peptide as a function of the angle between its long axis and the zigzag directions of the underlying graphene sheet.
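
The Sim_Figure-5_replica entry above mentions 20 replicas spanning 295 to 454 K but does not say how the intermediate temperatures were chosen. The sketch below shows one common choice, a geometric ladder with a constant ratio between neighboring temperatures. The geometric spacing is an assumption made here for illustration, not a statement about the actual simulations, and the function name is hypothetical.

```python
def geometric_ladder(t_min, t_max, n_replicas):
    """Return n_replicas temperatures from t_min to t_max with a constant ratio.

    Geometric spacing is assumed for illustration; the data set description
    does not state which spacing was actually used.
    """
    if n_replicas < 2:
        raise ValueError("need at least two replicas")
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


# Endpoints and replica count taken from the Sim_Figure-5_replica entry.
temps = geometric_ladder(295.0, 454.0, 20)
for index, temperature in enumerate(temps):
    print(f"replica {index:2d}: {temperature:6.1f} K")
```

The actual temperatures used are recorded in the NAMD configuration files of that directory and should be preferred over this sketch.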

     

    This material is based upon work supported by the US National Science Foundation under grant no. DMR-1945589. A majority of the computing for this project was performed on the Beocat Research Cluster at Kansas State University, which is funded in part by NSF grants CHE-1726332, CNS-1006860, EPS-1006860, and EPS-0919443. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562, through allocation BIO200030. 