Title: NUScon: a community-driven platform for quantitative evaluation of nonuniform sampling in NMR
Abstract. Although the concepts of nonuniform sampling (NUS) and non-Fourier spectral reconstruction in multidimensional NMR began to emerge four decades ago (Bodenhausen and Ernst, 1981; Barna and Laue, 1987), it is only relatively recently that NUS has become more commonplace. Advantages of NUS include the ability to tailor experiments to reduce data collection time and to improve spectral quality, whether through detection of closely spaced peaks (i.e., “resolution”) or peaks of weak intensity (i.e., “sensitivity”). Wider adoption of these methods is the result of improvements in computational performance, a growing abundance and flexibility of software, support from NMR spectrometer vendors, and the increased data sampling demands imposed by higher magnetic fields. However, the identification of best practices remains a significant and unmet challenge. Unlike the discrete Fourier transform, non-Fourier methods used to reconstruct spectra from NUS data are nonlinear, depend on the complexity and nature of the signals, and lack quantitative or formal theory describing their performance. Seemingly subtle algorithmic differences may lead to significant variability in spectral quality and artifacts. A community-based critical assessment of NUS challenge problems has been initiated, called the “Nonuniform Sampling Contest” (NUScon), with the objective of determining best practices for processing and analyzing NUS experiments. We address this objective by constructing challenges from NMR experiments that we inject with synthetic signals, and we process these challenges using workflows submitted by the community. In the initial rounds of NUScon our aim is to establish objective criteria for evaluating the quality of spectral reconstructions. We present here a software package for performing the quantitative analyses, and we present the results from the first two rounds of NUScon. 
We discuss the challenges that remain and present a roadmap for continued community-driven development with the ultimate aim of providing best practices in this rapidly evolving field. The NUScon software package and all data from evaluating the challenge problems are hosted on the NMRbox platform.
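The challenge construction described in the abstract (injecting synthetic signals into existing data, then sampling nonuniformly) can be illustrated with a toy one-dimensional sketch. Everything here is an assumption made for illustration: the function names, the exponentially decaying sinusoids, the random sampling schedule, and the zero-filled DFT baseline are hypothetical stand-ins and are not taken from the NUScon software or from any submitted workflow.

```python
import cmath
import math
import random

def synthetic_fid(n, freq, decay, amplitude=1.0):
    """Exponentially decaying complex sinusoid; freq is in cycles per point."""
    return [amplitude * cmath.exp((2j * math.pi * freq - decay) * t) for t in range(n)]

def inject(fid, peak):
    """Add a synthetic signal to an existing time-domain signal, point by point."""
    return [a + b for a, b in zip(fid, peak)]

def nus_schedule(n, fraction, seed=0):
    """Random subset of the n uniform grid points (point 0 is always kept)."""
    rng = random.Random(seed)
    k = max(1, round(n * fraction))
    return sorted([0] + rng.sample(range(1, n), k - 1))

def zero_filled_dft(fid, schedule):
    """Naive baseline: DFT with every unsampled point replaced by zero."""
    n = len(fid)
    kept = set(schedule)
    masked = [x if t in kept else 0j for t, x in enumerate(fid)]
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(masked))
        for k in range(n)
    ]

n = 128
background = synthetic_fid(n, freq=20 / n, decay=0.005)                # stands in for measured data
weak_peak = synthetic_fid(n, freq=50 / n, decay=0.005, amplitude=0.2)  # injected synthetic signal
challenge = inject(background, weak_peak)

schedule = nus_schedule(n, fraction=0.25)  # keep 25% of the grid
full = [abs(v) for v in zero_filled_dft(challenge, list(range(n)))]
sparse = [abs(v) for v in zero_filled_dft(challenge, schedule)]

print("sampled points:", len(schedule))
print("bin 50 (injected peak), uniform vs. NUS:", full[50], sparse[50])
```

In a sketch like this, the position and amplitude of the injected peak are known in advance, so a reconstruction can be scored on whether it recovers them. The zero-filled DFT is only a reference point; the non-Fourier reconstruction methods discussed in the abstract would take its place.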
Award ID(s):
1660921
NSF-PAR ID:
10348041
Journal Name:
Magnetic Resonance
Volume:
2
Issue:
2
ISSN:
2699-0016
Page Range / eLocation ID:
843 to 861
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Non-invasive and label-free spectral microscopy (spectromicroscopy) techniques can provide quantitative biochemical information complementary to genomic sequencing, transcriptomic profiling, and proteomic analyses. However, spectromicroscopy techniques generate high-dimensional data; acquisition of a single spectral image can range from tens of minutes to hours, depending on the desired spatial resolution and the image size. This substantially limits the timescales of observable transient biological processes. To address this challenge and move spectromicroscopy towards efficient real-time spatiochemical imaging, we developed a grid-less autonomous adaptive sampling method. Our method substantially decreases image acquisition time while increasing sampling density in regions of steeper physico-chemical gradients. When implemented with scanning Fourier Transform infrared spectromicroscopy experiments, this grid-less adaptive sampling approach outperformed standard uniform grid sampling in a two-component chemical model system and in a complex biological sample, Caenorhabditis elegans. We quantitatively and qualitatively assess the efficiency of data acquisition using performance metrics and multivariate infrared spectral analysis, respectively.
  2. Abstract The number and diversity of phenological studies has increased rapidly in recent years. Innovative experiments, field studies, citizen science projects, and analyses of newly available historical data are contributing insights that advance our understanding of ecological and evolutionary responses to the environment, particularly climate change. However, many phenological data sets have peculiarities that are not immediately obvious and can lead to mistakes in analyses and interpretation of results. This paper aims to help researchers, especially those new to the field of phenology, understand challenges and practices that are crucial for effective studies. For example, researchers may fail to account for sampling biases in phenological data, struggle to choose or design a volunteer data collection strategy that adequately fits their project’s needs, or combine data sets in inappropriate ways. We describe ten best practices for designing studies of plant and animal phenology, evaluating data quality, and analyzing data. Practices include accounting for common biases in data, using effective citizen or community science methods, and employing appropriate data when investigating phenological mismatches. We present these best practices to help researchers entering the field take full advantage of the wealth of available data and approaches to advance our understanding of phenology and its implications for ecology. 
  3. The heightened dipolar interactions in solids render solid-state NMR (ssNMR) spectra more difficult to interpret than solution NMR spectra. On the other hand, ssNMR does not suffer from severe molecular weight limitations like solution NMR. In recent years, ssNMR has undergone rapid technological developments that have enabled structure–function studies of increasingly larger biomolecules, including membrane proteins. Current methodology includes stable isotope labeling schemes, non-uniform sampling with spectral reconstruction, faster magic angle spinning, and innovative pulse sequences that capture different types of interactions among spins. However, computational tools for the analysis of complex ssNMR data from membrane proteins and other challenging protein systems have lagged behind those for solution NMR. Before a structure can be determined, thousands of signals from individual types of multidimensional ssNMR spectra of samples, which may have differing isotopic composition, must be recognized, correlated, categorized, and eventually assigned to atoms in the chemical structure. To address these tedious steps, we have developed an automated algorithm for ssNMR spectra called “ssPINE”. The ssPINE software accepts the sequence of the protein plus peak lists from a variety of ssNMR experiments as inputs and offers automated backbone and side-chain assignments. The alpha version of ssPINE, which we describe here, is freely available through a web submission form. 
  4. Abstract

    Rapid progress in machine learning offers new opportunities for the automated analysis of multidimensional NMR spectra ranging from protein NMR to metabolomics applications. Most recently, it has been demonstrated how deep neural networks (DNN) designed for spectral peak picking are capable of deconvoluting highly crowded NMR spectra rivaling the facilities of human experts. Superior DNN-based peak picking is one of a series of critical steps during NMR spectral processing, analysis, and interpretation where machine learning is expected to have a major impact. In this perspective, we lay out some of the unique strengths as well as challenges of machine learning approaches in this new era of automated NMR spectral analysis. Such a discussion seems timely and should help define common goals for the NMR community, encourage the sharing of software tools and the standardization of protocols, and calibrate expectations. It will also help prepare for an NMR future where machine learning and artificial intelligence tools will be commonplace.
  5. We address the problem of learning a sparsifying graph Fourier transform (GFT) for compressible signals on directed graphs (digraphs). Blending the merits of Fourier and dictionary learning representations, the goal is to obtain an orthonormal basis that captures spread modes of signal variation with respect to the underlying network topology, and yields parsimonious representations of bandlimited signals. Accordingly, we learn a data-adapted dictionary by minimizing a spectral dispersion criterion over the achievable frequency range, along with a sparsity-promoting regularization term on the GFT coefficients of training signals. An iterative algorithm is developed which alternates between minimizing a smooth objective over the Stiefel manifold, and soft-thresholding the graph-spectral domain representations of the signals in the training set. A frequency analysis of temperature measurements recorded across the contiguous United States illustrates the merits of the novel GFT design. 
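The soft-thresholding step mentioned in this abstract can be sketched with the standard elementwise shrinkage operator, S_λ(x) = sign(x) · max(|x| − λ, 0). This is a minimal illustration only: the function names, the example coefficients, and the threshold value are hypothetical, and the other half of the alternating scheme (minimizing a smooth objective over the Stiefel manifold) is not shown.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; any |x| <= lam becomes exactly 0.0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def soft_threshold_all(coeffs, lam):
    """Apply the shrinkage operator to every coefficient in a list."""
    return [soft_threshold(c, lam) for c in coeffs]

# Hypothetical graph-spectral coefficients of one training signal.
coeffs = [3.0, -0.4, 0.1, -2.5, 0.0]
sparse = soft_threshold_all(coeffs, lam=0.5)
print(sparse)  # [2.5, 0.0, 0.0, -2.0, 0.0]
```

Small coefficients are set exactly to zero while large ones are reduced by the threshold, which is how a sparsity-promoting regularization term of this kind yields parsimonious representations.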