Title: Unsupervised Analysis of Small Molecule Mixtures by Wavelet-Based Super-Resolved NMR
Resolving small molecule mixtures by nuclear magnetic resonance (NMR) spectroscopy has long been of great interest because of the technique's precision, reproducibility, and efficiency. However, spectral analyses of such mixtures are often highly challenging due to overlapping resonance lines and limited chemical shift windows. Existing experimental and theoretical methods that produce shift NMR spectra to address this problem have limited applicability owing to sensitivity issues, inconsistency, and/or the requirement of prior knowledge. Recently, we resolved the problem by decoupling multiplet structures in NMR spectra using the wavelet packet transform (WPT) technique. In this work, we developed a scheme that deploys the method to generate highly resolved WPT NMR spectra and to predict the composition of the corresponding molecular mixtures from their ¹H NMR spectra in an automated fashion. The four-step spectral analysis scheme consists of calculating the WPT spectrum, matching peaks against a WPT shift NMR library, and two optimization steps that produce the predicted molecular composition of a mixture. The robustness of the method was tested on an augmented dataset of 1000 molecular mixtures, each containing 3 to 7 molecules. The method successfully predicted the constituent molecules with a median true positive rate of 1.0 across the varying compositions, with a median false positive rate of 0.04. The approach can be scaled easily to much larger datasets.
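The peak-matching step and the reported true/false positive rates can be illustrated with a minimal sketch. This is not the authors' implementation: the toy library, the chemical shift values, the tolerance `tol`, and the function names `match_candidates` and `rates` are all hypothetical, and the two optimization steps of the scheme are omitted. It assumes that each library entry is a list of singlet positions (ppm) and that a molecule is a candidate when every one of its reference peaks has an observed peak within the tolerance.

```python
from typing import Dict, List, Set, Tuple


def match_candidates(observed_ppm: List[float],
                     library: Dict[str, List[float]],
                     tol: float = 0.01) -> Set[str]:
    """Return library molecules whose reference peaks are all found
    among the observed peaks, within +/- tol ppm (assumed criterion)."""
    hits = set()
    for name, ref_peaks in library.items():
        if all(any(abs(obs - ref) <= tol for obs in observed_ppm)
               for ref in ref_peaks):
            hits.add(name)
    return hits


def rates(predicted: Set[str], truth: Set[str],
          library_names: Set[str]) -> Tuple[float, float]:
    """True positive rate and false positive rate, counting every
    library molecule absent from the mixture as a negative."""
    tp = len(predicted & truth)
    fp = len(predicted - truth)
    fn = len(truth - predicted)
    tn = len(library_names - predicted - truth)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


# Hypothetical toy library; the shift values are illustrative only.
library = {
    "ethanol": [1.17, 3.65],
    "acetone": [2.22],
    "acetate": [1.91],
    "methanol": [3.35],
    "lactate": [1.32, 4.10],
}
observed = [1.171, 2.219, 3.652, 3.349]   # peaks of a simulated mixture
truth = {"ethanol", "acetone"}            # molecules actually present

predicted = match_candidates(observed, library)
tpr, fpr = rates(predicted, truth, set(library))
print(sorted(predicted), tpr, round(fpr, 3))
```

In this toy case both true components are recovered, and one extra molecule passes the matching criterion because a peak happens to fall within tolerance of its single reference line. Spurious hits of this kind are what a later optimization stage would need to prune.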
Award ID(s): 2044599
PAR ID: 10483504
Author(s) / Creator(s):
Publisher / Repository: MDPI
Date Published:
Journal Name: Molecules
Volume: 28
Issue: 2
ISSN: 1420-3049
Page Range / eLocation ID: 792
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Straightforward identification of chiral molecules in multi-component mixtures of unknown composition is extremely challenging. Current spectrometric and chromatographic methods cannot unambiguously identify components, while state-of-the-art spectroscopic methods are limited by the difficult and time-consuming task of spectral assignment. Here, we introduce a highly sensitive generalized version of microwave three-wave mixing that uses broad-spectrum fields to detect chiral molecules in enantiomeric excess without any prior chemical knowledge of the sample. This method does not require spectral assignment as a necessary step to extract information from a spectrum. We demonstrate our method by recording three-wave mixing spectra of multi-component samples that provide direct evidence of enantiomeric excess. Our method opens up new capabilities in ultrasensitive phase-coherent spectroscopic detection that can be applied for chiral detection in real-life mixtures, raw products of chemical reactions, and difficult-to-assign novel exotic species.
  2. In this work, the second-order kinetics of molecules exchanging between chemically distinct microenvironments, such as those found in nanoemulsions, is studied using nuclear magnetic resonance (NMR). A unique aspect of NMR exchange studies in nanoemulsions is that the difference in molecular resonance frequencies between the two phases, which determines whether the exchange is fast, intermediate, or slow on the NMR timescale, can depend upon the emulsion droplet composition, which is itself determined by the kinetic exchange constants. Within the fast-exchange regime, changes in resonance frequencies and line widths with dilution were used to extract the exchange rate constants from the NMR spectra in a manner analogous to determining the kinetic parameters in NMR ligand binding experiments. As a demonstration, the kinetic exchange parameters of isoflurane release from an emulsification of isoflurane and perfluorotributylamine (FC43) were determined using NMR dilution and diffusion studies.
  3. The analysis of nuclear magnetic resonance (NMR) spectra for the comprehensive and unambiguous identification and characterization of peaks is a difficult, but critically important step in all NMR analyses of complex biological molecular systems. Here, we introduce DEEP Picker, a deep neural network (DNN)-based approach for peak picking and spectral deconvolution which semi-automates the analysis of two-dimensional NMR spectra. DEEP Picker includes 8 hidden convolutional layers and was trained on a large number of synthetic spectra of known composition with variable degrees of crowdedness. We show that our method is able to correctly identify overlapping peaks, including ones that are challenging for expert spectroscopists and existing computational methods alike. We demonstrate the utility of DEEP Picker on NMR spectra of folded and intrinsically disordered proteins as well as a complex metabolomics mixture, and show how it provides access to valuable NMR information. DEEP Picker should facilitate the semi-automation and standardization of protocols for better consistency and sharing of results within the scientific community. 
  4. Inferring molecular structure from nuclear magnetic resonance (NMR) measurements requires an accurate forward model that can predict chemical shifts from 3D structure. Current forward models are limited to specific molecules like proteins, and state-of-the-art models are not differentiable. Thus they cannot be used with gradient methods like biased molecular dynamics. Here we use graph neural networks (GNNs) for NMR chemical shift prediction. Our GNN can model chemical shifts accurately, capture important phenomena like hydrogen-bonding-induced downfield shifts between multiple proteins and secondary structure effects, and predict shifts of organic molecules. Previous empirical models of protein NMR have relied on careful feature engineering with domain expertise. These GNNs are trained from data alone with no feature engineering, yet are as accurate and can work on arbitrary molecular structures. The models are also efficient, able to compute one million chemical shifts in about 5 seconds. This work enables a new category of NMR models that have multiple interacting types of macromolecules.
  5. We present a rapid method for apportioning the sources of atmospheric organic aerosol composition measured by gas chromatography–mass spectrometry methods. Here, we specifically apply this new analysis method to data acquired on a thermal desorption aerosol gas chromatograph (TAG) system. Gas chromatograms are divided by retention time into evenly spaced bins, within which the mass spectra are summed. A previous chromatogram binning method was introduced for the purpose of chromatogram structure deconvolution (e.g., major compound classes) (Zhang et al., 2014). Here we extend the method development for the specific purpose of determining aerosol samples' sources. Chromatogram bins are arranged into an input data matrix for positive matrix factorization (PMF), where the sample number is the row dimension and the mass-spectra-resolved eluting time intervals (bins) are the column dimension. Then two-dimensional PMF can effectively do three-dimensional factorization on the three-dimensional TAG mass spectra data. The retention time shift of the chromatogram is corrected by applying the median values of the different peaks' shifts. Bin width affects chemical resolution but does not affect PMF retrieval of the sources' time variations for low-factor solutions. A bin width smaller than the maximum retention shift among all samples requires retention time shift correction. A six-factor PMF comparison among aerosol mass spectrometry (AMS), TAG binning, and conventional TAG compound integration methods shows that the TAG binning method performs similarly to the integration method. However, the new binning method incorporates the entirety of the data set and requires significantly less pre-processing of the data than conventional single compound identification and integration. In addition, while a fraction of the most oxygenated aerosol does not elute through an underivatized TAG analysis, the TAG binning method does have the ability to achieve molecular level resolution on other bulk aerosol components commonly observed by the AMS.
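The chromatogram binning described in item 5 (evenly spaced retention-time bins, mass spectra summed within each bin, one matrix row per sample) can be sketched as follows. This is an illustrative reading of that description only, not the published code: the names `bin_chromatogram` and `build_pmf_matrix`, the scan data layout, and the toy numbers are assumptions, and neither the retention time shift correction nor the PMF factorization itself is performed here.

```python
from typing import List, Sequence, Tuple

# Assumed data layout: one scan = (retention time, intensity per m/z channel).
Scan = Tuple[float, Sequence[float]]


def bin_chromatogram(scans: Sequence[Scan], t_start: float, t_end: float,
                     n_bins: int, n_mz: int) -> List[float]:
    """Sum the mass spectra of all scans falling in each evenly spaced
    retention-time bin, then flatten to one row of n_bins * n_mz values."""
    width = (t_end - t_start) / n_bins
    binned = [[0.0] * n_mz for _ in range(n_bins)]
    for rt, spectrum in scans:
        if not (t_start <= rt < t_end):
            continue  # ignore scans outside the binned window
        b = min(int((rt - t_start) / width), n_bins - 1)
        for k in range(n_mz):
            binned[b][k] += spectrum[k]
    return [value for row in binned for value in row]


def build_pmf_matrix(samples: Sequence[Sequence[Scan]], t_start: float,
                     t_end: float, n_bins: int, n_mz: int) -> List[List[float]]:
    """Rows are samples; columns are (bin, m/z channel) pairs."""
    return [bin_chromatogram(s, t_start, t_end, n_bins, n_mz) for s in samples]


# Toy data: two samples, two m/z channels, retention times in minutes.
sample_a = [(0.5, [1.0, 0.0]), (1.5, [2.0, 1.0]),
            (2.5, [0.0, 3.0]), (3.5, [1.0, 1.0])]
sample_b = [(0.2, [0.0, 2.0]), (0.8, [1.0, 1.0]), (3.9, [4.0, 0.0])]

matrix = build_pmf_matrix([sample_a, sample_b], 0.0, 4.0, n_bins=2, n_mz=2)
for row in matrix:
    print(row)
```

Flattening each sample's bin-by-channel table into a single row is what lets a two-dimensional factorization operate on data that are three-dimensional (sample, retention time, m/z).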