Title: DEEP Picker is a Deep Neural Network for Accurate Deconvolution of Complex Two-Dimensional NMR Spectra
The analysis of nuclear magnetic resonance (NMR) spectra for the comprehensive and unambiguous identification and characterization of peaks is a difficult, but critically important step in all NMR analyses of complex biological molecular systems. Here, we introduce DEEP Picker, a deep neural network (DNN)-based approach for peak picking and spectral deconvolution which semi-automates the analysis of two-dimensional NMR spectra. DEEP Picker includes 8 hidden convolutional layers and was trained on a large number of synthetic spectra of known composition with variable degrees of crowdedness. We show that our method is able to correctly identify overlapping peaks, including ones that are challenging for expert spectroscopists and existing computational methods alike. We demonstrate the utility of DEEP Picker on NMR spectra of folded and intrinsically disordered proteins as well as a complex metabolomics mixture, and show how it provides access to valuable NMR information. DEEP Picker should facilitate the semi-automation and standardization of protocols for better consistency and sharing of results within the scientific community.
Award ID(s):
2103637
NSF-PAR ID:
10332410
Journal Name:
Nature communications
ISSN:
2041-1723
Sponsoring Org:
National Science Foundation
More Like this
  1. We demonstrate that natural isotopic abundance 2D heteronuclear correlation (HETCOR) solid-state NMR spectra can be used to significantly reduce or eliminate the broadening of 1H and 13C solid-state NMR spectra of organic solids due to anisotropic bulk magnetic susceptibility (ABMS). ABMS often manifests in solids with aromatic groups, such as active pharmaceutical ingredients (APIs), and inhomogeneously broadens the NMR peaks of all nuclei in the sample. Inhomogeneous peaks with full widths at half maximum (FWHM) of ∼1 ppm typically result from ABMS broadening and the low spectral resolution impedes the analysis of solid-state NMR spectra. ABMS broadening of solid-state NMR spectra has previously been eliminated using 2D multiple-quantum correlation experiments, or by performing NMR experiments on diluted materials or single crystals. However, these experiments are often infeasible due to their poor sensitivity and/or provide limited gains in resolution. 2D 1H–13C HETCOR experiments have previously been applied to reduce susceptibility broadening in paramagnetic solids and we show that this strategy can significantly reduce ABMS broadening in diamagnetic organic solids. Comparisons of 1D solid-state NMR spectra and 1H and 13C solid-state NMR spectra obtained from 2D 1H–13C HETCOR NMR spectra show that the HETCOR spectrum directly increases resolution by a factor of 1.5 to 8. The direct gain in resolution is determined by the ratio of the inhomogeneous 13C/1H linewidth to the homogeneous 1H linewidth, with the former depending on the magnitude of the ABMS broadening and the strength of the applied field and the latter on the efficiency of homonuclear decoupling. The direct gains in resolution obtained using the 2D HETCOR experiments are better than that obtained by dilution. For solids with long proton longitudinal relaxation times, dynamic nuclear polarization (DNP) was applied to enhance sensitivity and enable the acquisition of 2D 1H–13C HETCOR NMR spectra. 2D 1H–13C HETCOR experiments were applied to resolve and partially assign the NMR signals of the form I and form II polymorphs of aspirin in a sample containing both forms. These findings have important implications for ultra-high field NMR experiments, optimization of decoupling schemes and assessment of the fundamental limits on the resolution of solid-state NMR spectra.
  2. In exploring the conformational behavior of cyclic tungsten bis-alkyne complexes, two dialkynylamides (14a and 14c) and two dialkynylesters (14b and 14d) derived from 1,1’-ferrocenedicarboxylic acid were prepared. They were subsequently reacted with W(CO)3(dmtc)2 to yield the desired cyclic tungsten bis-alkyne complexes 8-11. In the cyclization of 14a to yield 8, a dimeric macrocyclic complex, 15, featuring two tungsten bis-alkyne complexes in the ring, was also isolated. The conformational behavior of these complexes was assessed by analysis of the 1H NMR resonances for the alkyne hydrogens, which appear around 11 ppm. The spectra for complexes 10, 11 and 15 show multiple singlets of varying integrations for these protons, while the spectra for complexes 8 and 9 show only two resonances of equal integration for the alkyne hydrogens. The spectra for 8 and 9 changed very little when examined at higher temperatures, indicating that the solution conformation is robust. A ROESY spectrum was obtained for 8. It did not show any crosspeaks between the two alkyne hydrogens. The NMR data show that the alkyne ligands in 10, 11 and 15 are able to rotate about the tungsten-alkyne bond; these complexes adopted several different solution conformations relating to syn and anti arrangements of the alkyne ligands. In contrast, complexes 8 and 9 adopt only one solution conformation, and the alkyne ligands in these species do not rotate about the tungsten-alkyne bond. The NMR spectra for 8 and 9 also show that these complexes are asymmetric: each hydrogen atom has its own unique resonance in the 1H NMR spectrum. There are 8 resonances for the 8 Cp protons, 4 resonances for the methylene protons, 2 resonances for the alkyne protons, and in the case of 8, 2 resonances for the NH protons. The two NH protons on complex 8 were found to have widely different chemical shifts. A DMSO titration was performed and it showed that one of the two NH protons in 8 is involved in an intramolecular hydrogen bond. Given that the diester 9 adopts a conformation similar to that of the diamide 8, this intramolecular hydrogen bond appears to result from the conformation imposed by cyclization of the ring system. Overall, the data show that the ring system for 8 and 9 provides a unique, rigid, robust, and air-stable cyclic molecule where the alkyne ligands are limited to one orientation, presumably the syn orientation. The lack of mobility for the alkyne ligands limits the cyclic molecule to only one solution conformation. Complexes 8 and 9 are the first reported examples of cyclic tungsten bis-alkyne complexes that only adopt a single, robust conformation in solution.
  3. In this paper, we develop structure-assisted nonnegative matrix factorization (NMF) methods for blind source separation of degenerate data. The motivation originates from nuclear magnetic resonance (NMR) spectroscopy, where multiple mixture NMR spectra are recorded to identify chemical compounds with similar structures. Considering the linear mixing model (LMM), we aim to identify the chemical compounds involved when the mixing process is known to be nearly singular. We first consider a class of data with dominant interval(s) (DI), where each of the source signals has dominant peaks over the others. In addition, a nearly singular mixing process produces degenerate mixtures. The DI condition implies clustering structures in the data points. Hence, the mixing matrix can be estimated by data clustering. Due to the presence of noise and the degeneracy of the data, a small deviation in the estimation may introduce errors in the output. To resolve this problem and improve the robustness of the separation, methods are developed in two aspects. One is to find a better estimate of the mixing matrix by allowing a constrained perturbation to the clustering output, which can be achieved by quadratic programming. The other is to seek sparse source signals by exploiting the DI condition, which amounts to solving an ℓ1 optimization problem. If no source information is available, we propose to adopt the nonnegative matrix factorization approach by incorporating the matrix structure (parallel columns of the mixing matrix) into the cost function, and we develop multiplicative iteration rules for the numerical solutions. We present experimental results on NMR data to show the performance and reliability of the method in applications arising in NMR spectroscopy.
  4. Background: Access to quantitative information is crucial to obtain a deeper understanding of biological systems. In addition to being low-throughput, traditional image-based analysis is mostly limited to error-prone qualitative or semi-quantitative assessment of phenotypes, particularly for complex subcellular morphologies. The PVD neuron in Caenorhabditis elegans, which is responsible for harsh touch and thermosensation, undergoes structural degeneration as nematodes age, characterized by the appearance of dendritic protrusions. Analysis of these neurodegenerative patterns is labor-intensive and limited to qualitative assessment. Results: In this work, we apply deep learning to perform quantitative image-based analysis of complex neurodegeneration patterns exhibited by the PVD neuron in C. elegans. We apply a convolutional neural network algorithm (Mask R-CNN) to identify neurodegenerative subcellular protrusions that appear after cold-shock or as a result of aging. A multiparametric phenotypic profile captures the unique morphological changes induced by each perturbation. We identify that acute cold-shock-induced neurodegeneration is reversible and depends on rearing temperature and, importantly, that aging and cold-shock induce distinct neuronal beading patterns. Conclusion: The results of this work indicate that implementing deep learning for challenging image segmentation of PVD neurodegeneration enables quantitatively tracking subtle morphological changes in an unbiased manner. This analysis revealed that distinct patterns of morphological alteration are induced by aging and cold-shock, suggesting different mechanisms at play. This approach can be used to identify the molecular components involved in orchestrating neurodegeneration and to characterize the effect of other stressors on PVD degeneration.
  5. Background subtraction is a general problem in spectroscopy often addressed with application-specific techniques, or methods that introduce a variety of implementation barriers such as having to specify peak-free regions of the spectrum. An iterative dual-tree complex wavelet transform-based background subtraction method (DTCWT-IA) was recently developed for the analysis of ultrafast electron diffraction patterns. The method was designed to require minimal user intervention, to support streamlined analysis of many diffraction patterns with complex overlapping peaks and time-varying backgrounds, and is implemented in an open-source computer program. We examined the performance of DTCWT-IA for the analysis of spectra acquired by a range of optical spectroscopies including ultraviolet–visible spectroscopy (UV–Vis), X-ray photoelectron spectroscopy (XPS), and surface-enhanced Raman spectroscopy (SERS). A key benefit of the method is that the user need not specify regions of the spectrum where no peaks are expected to occur. SER spectra were used to investigate the robustness of DTCWT-IA to signal-to-noise levels in the spectrum and to user operation, specifically to two of the algorithm parameter settings: decomposition level and iteration number. The single, general DTCWT-IA implementation performs well in comparison to the different conventional approaches to background subtraction for UV–Vis, XPS, and SERS, while requiring minimal input. The method thus holds the same potential for optical spectroscopy as for ultrafast electron diffraction, namely streamlined analysis of spectra with complex distributions of peaks and varying signal levels, thus supporting real-time spectral analysis or the analysis of data acquired from different sources.