

Title: Real-time prediction of 1H and 13C chemical shifts with DFT accuracy using a 3D graph neural network
Nuclear magnetic resonance (NMR) is one of the primary techniques used to elucidate the chemical structure, bonding, stereochemistry, and conformation of organic compounds. The distinct chemical shifts in an NMR spectrum depend upon each atom's local chemical environment and are influenced by both through-bond and through-space interactions with other atoms and functional groups. The in silico prediction of NMR chemical shifts using quantum mechanical (QM) calculations is now commonplace in aiding organic structural assignment, since spectra can be computed for several candidate structures and then compared with experimental values to find the best possible match. However, calculating multiple structural isomers and stereoisomers, each of which may exist as an ensemble of rapidly interconverting conformations, is computationally expensive. Additionally, the QM predictions themselves may lack sufficient accuracy to identify a correct structure. In this work, we address both of these shortcomings by developing a rapid machine learning (ML) protocol to predict 1H and 13C chemical shifts through an efficient graph neural network (GNN) using 3D structures as input. Transfer learning with experimental data is used to improve the final prediction accuracy of a model trained using QM calculations. When tested on the CHESHIRE dataset, the proposed model predicts observed 13C chemical shifts with accuracy comparable to the best-performing DFT functionals (1.5 ppm) in around 1/6000 of the CPU time. An automated prediction webserver and graphical interface are accessible online at http://nova.chem.colostate.edu/cascade/.
We further demonstrate the model in three applications: first, we use the model to identify the correct organic structure among candidates by comparison with experimental spectra, including complex stereoisomers; second, we automatically detect and revise incorrect chemical shift assignments in a popular NMR database, the NMRShiftDB; and third, we use NMR chemical shifts as descriptors for determining the sites of electrophilic aromatic substitution.
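The candidate-matching step described above (compute shifts for each candidate structure, then compare with experiment) can be illustrated with a minimal sketch. It assumes the predicted and observed shifts are already paired atom by atom and uses the mean absolute error (MAE) as the score; the function names, candidate labels, and shift values are hypothetical and invented for illustration only. The abstract does not specify the scoring procedure used in the paper, so this is not the authors' implementation.

```python
from statistics import mean


def mean_absolute_error(predicted, observed):
    """MAE in ppm between two equal-length lists of shifts, paired by atom."""
    if len(predicted) != len(observed):
        raise ValueError("shift lists must have the same length")
    return mean(abs(p - o) for p, o in zip(predicted, observed))


def rank_candidates(predicted_by_candidate, observed):
    """Return (name, MAE) pairs sorted from best to worst match."""
    scores = {
        name: mean_absolute_error(shifts, observed)
        for name, shifts in predicted_by_candidate.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical 13C shifts (ppm); these numbers are made up for illustration.
observed = [170.2, 128.5, 77.1, 21.0]
predicted = {
    "candidate_A": [169.0, 129.5, 76.1, 22.0],
    "candidate_B": [165.2, 133.5, 72.1, 26.0],
}

ranking = rank_candidates(predicted, observed)
best_name, best_mae = ranking[0]
print(best_name, round(best_mae, 2))
```

In practice the predicted shifts for a flexible molecule would likely be averaged over its conformer ensemble before scoring; that step is omitted here to keep the sketch short.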
Award ID(s):
1925607
NSF-PAR ID:
10352897
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Chemical Science
Volume:
12
Issue:
36
ISSN:
2041-6520
Page Range / eLocation ID:
12012 to 12026
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    This manuscript describes predicted NMR shifts for the limonoid natural product xylogranatin F. The 1H and 13C NMR shifts of four diastereomers were evaluated by GIAO and hybrid DFT/parametric DU8+ methods. The results of the 1H and 13C NMR calculations for both the GIAO method and the DU8+ calculations suggest the revised structure that was recently reassigned by chemical synthesis. Furthermore, we show that while DU8+ provides superior accuracy with less computation time, GIAO points to the correct structure with more distinguishable data in this case study.
  2. Abstract

    In oriented‐sample (OS) solid‐state NMR of membrane proteins, the angular‐dependent dipolar couplings and chemical shifts provide a direct input for structure calculations. However, so far only 1H–15N dipolar couplings and 15N chemical shifts have been routinely assessed in oriented 15N‐labeled samples. The main obstacle for extending this technique to membrane proteins of arbitrary topology has remained in the lack of additional experimental restraints. We have developed a new experimental triple‐resonance NMR technique, which was applied to uniformly doubly (15N,13C)‐labeled Pf1 coat protein in magnetically aligned DMPC/DHPC bicelles. The previously inaccessible 1Hα–13Cα dipolar couplings have been measured, which make it possible to determine the torsion angles between the peptide planes without assuming α‐helical structure a priori. The fitting of three angular restraints per peptide plane and filtering by Rosetta scoring functions has yielded a consensus α‐helical transmembrane structure for Pf1 protein.
  4. We investigate 29Si nuclear magnetic resonance (NMR) chemical shifts, δiso, of silicon nitride. Our goal is to relate the local structure to the NMR signal and, thus, provide the means to extract more information from the experimental 29Si NMR spectra in this family of compounds. We apply structural modeling and the gauge-included projector augmented wave (GIPAW) method within density functional theory (DFT) calculations. Our models comprise known and hypothetical crystalline Si3N4, as well as amorphous Si3N4 structures. We find good agreement with available experimental 29Si NMR data for tetrahedral Si[4] and octahedral Si[6] in crystalline Si3N4, predict the chemical shift of a trigonal-bipyramidal Si[5] to be about −120 ppm, and quantify the impact of Si-N bond lengths on 29Si δiso. We show through computations that experimental 29Si NMR data indicate that silicon dicarbodiimide, Si(NCN)2, exhibits bent Si-N-C units with angles of about 143° in its structure. A detailed investigation of amorphous silicon nitride shows that an observed peak asymmetry relates to the proximity of a fifth N neighbor at a non-bonding distance of between 2.5 and 2.8 Å from Si. We reveal the impact of both the Si-N(H)-Si bond angle and the Si-N bond length on 29Si δiso in a hydrogenated silicon nitride structure, silicon diimide Si(NH)2.
  5. Inferring molecular structure from Nuclear Magnetic Resonance (NMR) measurements requires an accurate forward model that can predict chemical shifts from 3D structure. Current forward models are limited to specific classes of molecules such as proteins, and state-of-the-art models are not differentiable, so they cannot be used with gradient-based methods such as biased molecular dynamics. Here we use graph neural networks (GNNs) for NMR chemical shift prediction. Our GNN can model chemical shifts accurately and capture important phenomena such as hydrogen-bonding-induced downfield shifts between multiple proteins and secondary structure effects, and it can predict shifts of organic molecules. Previous empirical models of protein NMR have relied on careful feature engineering with domain expertise. These GNNs are trained from data alone with no feature engineering, yet are as accurate and work on arbitrary molecular structures. The models are also efficient, able to compute one million chemical shifts in about 5 seconds. This work enables a new category of NMR models that can handle multiple interacting types of macromolecules.