

Title: New scaling relations to compute atom-in-material polarizabilities and dispersion coefficients: part 1. Theory and accuracy
Polarizabilities and London dispersion forces are important to many chemical processes. Force fields for classical atomistic simulations can be constructed using atom-in-material polarizabilities and Cn (n = 6, 8, 9, 10, …) dispersion coefficients. This article addresses the key question of how to efficiently assign these parameters to constituent atoms in a material so that properties of the whole material are better reproduced. We develop a new set of scaling laws and computational algorithms (called MCLF) to do this in an accurate and computationally efficient manner across diverse material types. We introduce a conduction limit upper bound and m-scaling to describe the different behaviors of surface and buried atoms. We validate MCLF by comparing results to high-level benchmarks for isolated neutral and charged atoms, diverse diatomic molecules, various polyatomic molecules (e.g., polyacenes, fullerenes, and small organic and inorganic molecules), and dense solids (including metallic, covalent, and ionic). We also present results for the HIV reverse transcriptase enzyme complexed with an inhibitor molecule. MCLF provides the non-directionally screened polarizabilities required to construct force fields, the directionally-screened static polarizability tensor components and eigenvalues, and environmentally screened C6 coefficients. Overall, MCLF has improved accuracy compared to the TS-SCS method. For TS-SCS, we compared charge partitioning methods and show DDEC6 partitioning yields more accurate results than Hirshfeld partitioning. MCLF also gives approximations for C8, C9, and C10 dispersion coefficients and quantum Drude oscillator parameters. This method should find widespread applications to parameterize classical force fields and density functional theory (DFT) + dispersion methods.
Award ID(s):
1555376
NSF-PAR ID:
10122309
Journal Name:
RSC Advances
Volume:
9
Issue:
34
ISSN:
2046-2069
Page Range / eLocation ID:
19297 to 19324
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We present two algorithms to compute system-specific polarizabilities and dispersion coefficients such that required memory and computational time scale linearly with increasing number of atoms in the unit cell for large systems. The first algorithm computes the atom-in-material (AIM) static polarizability tensors, force-field polarizabilities, and C6, C8, C9, C10 dispersion coefficients using the MCLF method. The second algorithm computes the AIM polarizability tensors and C6 coefficients using the TS-SCS method. Linear-scaling computational cost is achieved using a dipole interaction cutoff length function combined with iterative methods that avoid large dense matrix multiplies and large matrix inversions. For MCLF, Richardson extrapolation of the screening increments is used. For TS-SCS, a failproof conjugate residual (FCR) algorithm is introduced that solves any linear equation system having a Hermitian coefficient matrix. These algorithms have mathematically provable stable convergence that resists round-off errors. We parallelized these methods to provide rapid computation on multi-core computers. Excellent parallelization efficiencies were obtained, and adding parallel processors does not significantly increase memory requirements. This enables system-specific polarizabilities and dispersion coefficients to be readily computed for materials containing millions of atoms in the unit cell. The largest example studied herein is an ice crystal containing >2 million atoms in the unit cell. For this material, the FCR algorithm solved a linear equation system containing >6 million rows, 7.57 billion interacting atom pairs, 45.4 billion stored non-negligible matrix components used in each large matrix-vector multiplication, and ∼19 million unknowns per frequency point (>300 million total unknowns).
  2. A host of important performance properties for metal–organic frameworks (MOFs) and other complex materials can be calculated by modeling statistical ensembles. The principal challenge is to develop accurate and computationally efficient interaction models for these simulations. Two major approaches are (i) ab initio molecular dynamics in which the interaction model is provided by an exchange–correlation theory (e.g., DFT + dispersion functional) and (ii) molecular mechanics in which the interaction model is a parameterized classical force field. The first approach requires further development to improve computational speed. The second approach requires further development to automate accurate forcefield parameterization. Because of the extreme chemical diversity across thousands of MOF structures, this problem is still mostly unsolved today. For example, here we show structures in the 2014 CoRE MOF database contain more than 8 thousand different atom types based on first and second neighbors. Our results showed that atom types based on both first and second neighbors adequately capture the chemical environment, but atom types based on only first neighbors do not. For 3056 MOFs, we used density functional theory (DFT) followed by DDEC6 atomic population analysis to extract a host of important forcefield precursors: partial atomic charges; atom-in-material (AIM) C6, C8, and C10 dispersion coefficients; AIM dipole and quadrupole moments; various AIM polarizabilities; quantum Drude oscillator parameters; AIM electron cloud parameters; etc. Electrostatic parameters were validated through comparisons to the DFT-computed electrostatic potential. These forcefield precursors should find widespread applications to developing MOF force fields.
  3. This data set for the manuscript entitled "Design of Peptides that Fold and Self-Assemble on Graphite" includes all files needed to run and analyze the simulations described in this manuscript in the molecular dynamics software NAMD, as well as the output of the simulations. The files are organized into directories corresponding to the figures of the main text and supporting information. They include molecular model structure files (NAMD psf or Amber prmtop format), force field parameter files (in CHARMM format), initial atomic coordinates (pdb format), NAMD configuration files, Colvars configuration files, NAMD log files, and NAMD output including restart files (in binary NAMD format) and trajectories in dcd format (downsampled to 10 ns per frame). Analysis is controlled by shell scripts (Bash-compatible) that call VMD Tcl scripts or Python scripts. These scripts and their output are also included.

    Version: 2.0

    Changes versus version 1.0 are the addition of the free energy of folding, adsorption, and pairing calculations (Sim_Figure-7) and shifting of the figure numbers to accommodate this addition.


    Conventions Used in These Files
    ===============================

    Structure Files
    ----------------
    - graph_*.psf or sol_*.psf (original NAMD (XPLOR?) format psf file including atom details (type, charge, mass), as well as definitions of bonds, angles, dihedrals, and impropers for each dipeptide.)

    - graph_*.pdb or sol_*.pdb (initial coordinates before equilibration)
    - repart_*.psf (same as the above psf files, but the masses of non-water hydrogen atoms have been repartitioned by VMD script repartitionMass.tcl)
    - freeTop_*.pdb (same as the above pdb files, but the carbons of the lower graphene layer have been placed at a single z value and marked for restraints in NAMD)
    - amber_*.prmtop (combined topology and parameter files for Amber force field simulations)
    - repart_amber_*.prmtop (same as the above prmtop files, but the masses of non-water hydrogen atoms have been repartitioned by ParmEd)

    Force Field Parameters
    ----------------------
    CHARMM format parameter files:
    - par_all36m_prot.prm (CHARMM36m FF for proteins)
    - par_all36_cgenff_no_nbfix.prm (CGenFF v4.4 for graphene) The NBFIX parameters are commented out since they are only needed for aromatic halogens and we use only the CG2R61 type for graphene.
    - toppar_water_ions_prot_cgenff.str (CHARMM water and ions with NBFIX parameters needed for protein and CGenFF included and others commented out)

    Template NAMD Configuration Files
    ---------------------------------
    These contain the most commonly used simulation parameters. They are called by the other NAMD configuration files (which are in the namd/ subdirectory):
    - template_min.namd (minimization)
    - template_eq.namd (NPT equilibration with lower graphene fixed)
    - template_abf.namd (for adaptive biasing force)

    Minimization
    -------------
    - namd/min_*.0.namd

    Equilibration
    -------------
    - namd/eq_*.0.namd

    Adaptive biasing force calculations
    -----------------------------------
    - namd/eabfZRest7_graph_chp1404.0.namd
    - namd/eabfZRest7_graph_chp1404.1.namd (continuation of eabfZRest7_graph_chp1404.0.namd)

    Log Files
    ---------
    For each NAMD configuration file given in the last two sections, there is a log file with the same prefix, which gives the text output of NAMD. For instance, the output of namd/eabfZRest7_graph_chp1404.0.namd is eabfZRest7_graph_chp1404.0.log.

    Simulation Output
    -----------------
    The simulation output files (which match the names of the NAMD configuration files) are in the output/ directory. Files with the extensions .coor, .vel, and .xsc are, respectively, coordinates in NAMD binary format, velocities in NAMD binary format, and extended system information (including cell size) in text format. Files with the extension .dcd give the trajectory of the atomic coordinates over time (and also include system cell information). Due to storage limitations, large DCD files have been omitted or replaced with new DCD files having the prefix stride50_, which include only every 50th frame. The time between frames in these files is 50 * 50000 steps/frame * 4 fs/step = 10 ns. The system cell trajectory for the NPT runs is also included as output/eq_*.xst.
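
The frame spacing quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function name and argument names are hypothetical and are not part of the data set; the numbers are the ones stated in the text.

```python
def frame_interval_ns(stride, steps_per_frame, fs_per_step):
    """Time between saved frames, in nanoseconds.

    stride          -- keep every `stride`-th frame of the original trajectory
    steps_per_frame -- MD steps between frames in the original trajectory
    fs_per_step     -- integration timestep in femtoseconds
    """
    fs_between_frames = stride * steps_per_frame * fs_per_step
    return fs_between_frames / 1_000_000  # 1 ns = 1,000,000 fs


# Values quoted in the text: stride 50, 50000 steps/frame, 4 fs/step.
print(frame_interval_ns(50, 50000, 4))  # 10.0
```

The same helper gives the spacing for any other stride, should a differently downsampled trajectory be prepared.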

    Scripts
    -------
    Files with the .sh extension can be found throughout. These usually provide the highest level control for submission of simulations and analysis. Look to these as a guide to what is happening. If there are scripts with step1_*.sh and step2_*.sh, they are intended to be run in order, with step1_*.sh first.


    CONTENTS
    ========

    The directory contents are as follows. The directories Sim_Figure-1 and Sim_Figure-8 include README.txt files that describe the files and naming conventions used throughout this data set.

    Sim_Figure-1: Simulations of N-acetylated C-amidated amino acids (Ac-X-NHMe) at the graphite–water interface.

    Sim_Figure-2: Simulations of different peptide designs (including acyclic, disulfide cyclized, and N-to-C cyclized) at the graphite–water interface.

    Sim_Figure-3: MM-GBSA calculations of different peptide sequences for a folded conformation and 5 misfolded/unfolded conformations.

    Sim_Figure-4: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-5: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 295 K.

    Sim_Figure-5_replica: Temperature replica exchange molecular dynamics simulations for the peptide cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) with 20 replicas for temperatures from 295 to 454 K.

    Sim_Figure-6: Simulation of the peptide molecule cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) in free solution (no graphite).

    Sim_Figure-7: Free energy calculations for folding, adsorption, and pairing for the peptide CHP1404 (sequence: cyc(GTGSGTG-GPGG-GCGTGTG-SGPG)). For folding, we calculate the PMF as a function of RMSD by replica-exchange umbrella sampling (in the subdirectory Folding_CHP1404_Graphene/). We make the same calculation in solution, which required 3 separate replica-exchange umbrella sampling calculations (in the subdirectory Folding_CHP1404_Solution/). Both PMF of RMSD calculations for the scrambled peptide are in Folding_scram1404/. For adsorption, calculation of the PMF for the orientational restraints and the calculation of the PMF along z (the distance between the graphene sheet and the center of mass of the peptide) are in Adsorption_CHP1404/ and Adsorption_scram1404/. The actual calculation of the free energy is done by a shell script ("doRestraintEnergyError.sh") in the 1_free_energy/ subsubdirectory. Processing of the PMFs must be done first in the 0_pmf/ subsubdirectory. Finally, files for free energy calculations of pair formation for CHP1404 are found in the Pair/ subdirectory.

    Sim_Figure-8: Simulation of four peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) where the peptides are far above the graphene–water interface in the initial configuration.

    Sim_Figure-9: Two replicates of a simulation of nine peptide molecules with the sequence cyc(GTGSGTG-GPGG-GCGTGTG-SGPG) at the graphite–water interface at 370 K.

    Sim_Figure-9_scrambled: Two replicates of a simulation of nine peptide molecules with the control sequence cyc(GGTPTTGGGGGGSGGPSGTGGC) at the graphite–water interface at 370 K.

    Sim_Figure-10: Adaptive biasing for calculation of the free energy of the folded peptide as a function of the angle between its long axis and the zigzag directions of the underlying graphene sheet.

     

    This material is based upon work supported by the US National Science Foundation under grant no. DMR-1945589. A majority of the computing for this project was performed on the Beocat Research Cluster at Kansas State University, which is funded in part by NSF grants CHE-1726332, CNS-1006860, EPS-1006860, and EPS-0919443. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562, through allocation BIO200030. 
  4. Abstract

    The earlier integration of validated Lennard–Jones (LJ) potentials for 8 fcc metals into materials and biomolecular force fields has advanced multiple research fields, for example, metal–electrolyte interfaces, recognition of biomolecules, colloidal assembly of metal nanostructures, alloys, and catalysis. Here we introduce 12-6 and 9-6 LJ parameters for classical all-atom simulations of 10 further fcc metals (Ac, Ca (α), Ce (γ), Es (β), Fe (γ), Ir, Rh, Sr (α), Th (α), Yb (β)) and stainless steel. The parameters reproduce lattice constants, surface energies, water interfacial energies, and interactions with (bio)organic molecules in 0.1 to 5% agreement with experiment, as well as qualitative mechanical properties under standard conditions. Deviations are reduced by up to a factor of one hundred in comparison to earlier Lennard–Jones parameters, embedded atom models, and density functional theory. We also explain a quantitative correlation between atomization energies from experiments and surface energies that supports parameter development. The models are computationally very efficient and applicable to an exponential space of alloys. Compatibility with a wide range of force fields such as the Interface force field (IFF), AMBER, CHARMM, COMPASS, CVFF, DREIDING, OPLS-AA, and PCFF enables reliable simulations of nanostructures up to millions of atoms and microsecond time scales. User-friendly model building and input generation are available in the CHARMM-GUI Nanomaterial Modeler. As a limitation, deviations in mechanical properties vary and are comparable to DFT methods. We discuss the incorporation of reactivity and features of the electronic structure to expand the range of applications and further increase the accuracy.
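
    The abstract does not write out the 12-6 and 9-6 functional forms. For orientation, the expressions conventionally used for these two potentials (12-6 in CHARMM-type force fields, 9-6 in class II force fields such as PCFF and COMPASS) are given below; the symbols r_0 (equilibrium distance) and ε (well depth) are notation assumed here, not taken from the abstract, and the paper itself may use a different but equivalent parameterization.

```latex
% 12-6 Lennard-Jones form (minimum of -\varepsilon at r = r_0)
E_{12\text{-}6}(r) = \varepsilon \left[ \left( \frac{r_0}{r} \right)^{12} - 2 \left( \frac{r_0}{r} \right)^{6} \right]

% 9-6 Lennard-Jones form (minimum of -\varepsilon at r = r_0)
E_{9\text{-}6}(r) = \varepsilon \left[ 2 \left( \frac{r_0}{r} \right)^{9} - 3 \left( \frac{r_0}{r} \right)^{6} \right]
```

    In both forms the energy equals −ε at r = r_0, so each metal is described by the same two parameters; the softer r^-9 repulsion is what distinguishes the 9-6 form.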

     
  5. Databases of experimentally-derived metal–organic framework (MOF) crystal structures are useful for large-scale computational screening to identify which MOFs are best-suited for particular applications. However, these crystal structures must be cleaned to identify and/or correct various artifacts. The recently published 2019 CoRE MOF database (Chung et al., J. Chem. Eng. Data, 2019, 64, 5985–5998) reported thousands of experimentally-derived crystal structures that were partially cleaned to remove solvent molecules, to identify hundreds of disordered structures (approximately thirty of those were corrected), and to manually correct approximately 100 structures (e.g., adding missing hydrogen atoms). Herein, further cleaning of the 2019 CoRE MOF database is performed to identify structures with misbonded or isolated atoms: (i) structures containing an isolated atom, (ii) structures containing atoms too close together (i.e., overlapping atoms), (iii) structures containing a misplaced hydrogen atom, (iv) structures containing an under-bonded carbon atom (which might be caused by missing hydrogen atoms), and (v) structures containing an over-bonded carbon atom. This study should not be viewed as the final cleaning of this database, but rather as progress along the way towards the goal of someday achieving a completely cleaned set of experimentally-derived MOF crystal structures. We performed atom typing for all of the accepted structures to identify those structures that can be parameterized by previously reported forcefield precursors (Chen and Manz, RSC Adv., 2019, 9, 36492–36507). We report several forcefield precursors (e.g., net atomic charges, atom-in-material polarizabilities, atom-in-material dispersion coefficients, electron cloud parameters, etc.) for more than five thousand MOFs in the 2019 CoRE MOF database.