Abstract The number and diversity of phenological studies have increased rapidly in recent years. Innovative experiments, field studies, citizen science projects, and analyses of newly available historical data are contributing insights that advance our understanding of ecological and evolutionary responses to the environment, particularly climate change. However, many phenological data sets have peculiarities that are not immediately obvious and can lead to mistakes in analyses and interpretation of results. This paper aims to help researchers, especially those new to the field of phenology, understand challenges and practices that are crucial for effective studies. For example, researchers may fail to account for sampling biases in phenological data, struggle to choose or design a volunteer data collection strategy that adequately fits their project’s needs, or combine data sets in inappropriate ways. We describe ten best practices for designing studies of plant and animal phenology, evaluating data quality, and analyzing data. Practices include accounting for common biases in data, using effective citizen or community science methods, and employing appropriate data when investigating phenological mismatches. We present these best practices to help researchers entering the field take full advantage of the wealth of available data and approaches to advance our understanding of phenology and its implications for ecology.
NUScon: a community-driven platform for quantitative evaluation of nonuniform sampling in NMR
Abstract. Although the concepts of nonuniform sampling (NUS) and non-Fourier spectral reconstruction in multidimensional NMR began to emerge 4 decades ago (Bodenhausen and Ernst, 1981; Barna and Laue, 1987), it is only relatively recently that NUS has become more commonplace. Advantages of NUS include the ability to tailor experiments to reduce data collection time and to improve spectral quality, whether through detection of closely spaced peaks (i.e., “resolution”) or peaks of weak intensity (i.e., “sensitivity”). Wider adoption of these methods is the result of improvements in computational performance, a growing abundance and flexibility of software, support from NMR spectrometer vendors, and the increased data sampling demands imposed by higher magnetic fields. However, the identification of best practices remains a significant and unmet challenge. Unlike the discrete Fourier transform, non-Fourier methods used to reconstruct spectra from NUS data are nonlinear, depend on the complexity and nature of the signals, and lack quantitative or formal theory describing their performance. Seemingly subtle algorithmic differences may lead to significant variability in spectral quality and artifacts. A community-based critical assessment of NUS challenge problems has been initiated, called the “Nonuniform Sampling Contest” (NUScon), with the objective of determining best practices for processing and analyzing NUS experiments. We address this objective by constructing challenges from NMR experiments that we inject with synthetic signals, and we process these challenges using workflows submitted by the community. In the initial rounds of NUScon our aim is to establish objective criteria for evaluating the quality of spectral reconstructions. We present here a software package for performing the quantitative analyses, and we present the results from the first two rounds of NUScon. We discuss the challenges that remain and present a roadmap for continued community-driven development with the ultimate aim of providing best practices in this rapidly evolving field. The NUScon software package and all data from evaluating the challenge problems are hosted on the NMRbox platform.
- Award ID(s):
- 1660921
- PAR ID:
- 10348041
- Journal Name:
- Magnetic Resonance
- Volume:
- 2
- Issue:
- 2
- ISSN:
- 2699-0016
- Page Range / eLocation ID:
- 843 to 861
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- Background: Quantification of metabolites from nuclear magnetic resonance (NMR) spectra in an accurate, high-throughput manner requires effective data processing tools. Neural networks are relatively underexplored in quantitative NMR metabolomics despite impressive speed and throughput compared to more conventional peak-fitting metabolomics software. Methods: This work investigates practices for dataset and model development in the task of metabolite quantification directly from simulated NMR spectra for three neural network models: the multi-layered perceptron, the convolutional neural network, and the transformer. Model architectures, training parameters, and training datasets are optimized before comparing each model on simulated 400-MHz 1H-NMR spectra of complex mixtures with 8, 44, or 86 metabolites to quantify in spectra ranging from simple to highly complex and overlapping peaks. The optimized models were further validated on spectra at 100- and 800-MHz. Results: The transformer was the most effective network for NMR metabolite quantification, especially as the number of metabolites per spectrum increased or target concentrations were low or had a large dynamic range. Further, the transformer was able to accurately quantify metabolites in simulated spectra from 100-MHz up to 800-MHz. Conclusions: The methods developed in this work reveal that transformers have the potential to accurately perform fully automated metabolite quantification in real-time and, with further development with experimental data, could be the basis for automated quantitative NMR metabolomics software.
- The heightened dipolar interactions in solids render solid-state NMR (ssNMR) spectra more difficult to interpret than solution NMR spectra. On the other hand, ssNMR does not suffer from severe molecular weight limitations like solution NMR. In recent years, ssNMR has undergone rapid technological developments that have enabled structure–function studies of increasingly large biomolecules, including membrane proteins. Current methodology includes stable isotope labeling schemes, non-uniform sampling with spectral reconstruction, faster magic angle spinning, and innovative pulse sequences that capture different types of interactions among spins. However, computational tools for the analysis of complex ssNMR data from membrane proteins and other challenging protein systems have lagged behind those for solution NMR. Before a structure can be determined, thousands of signals from individual types of multidimensional ssNMR spectra of samples, which may have differing isotopic composition, must be recognized, correlated, categorized, and eventually assigned to atoms in the chemical structure. To address these tedious steps, we have developed an automated algorithm for ssNMR spectra called “ssPINE”. The ssPINE software accepts the sequence of the protein plus peak lists from a variety of ssNMR experiments as inputs and offers automated backbone and side-chain assignments. The alpha version of ssPINE, which we describe here, is freely available through a web submission form.
- Abstract The Fourier domain acceleration search (FDAS) is an effective technique for detecting faint binary pulsars in large radio astronomy data sets. This paper quantifies the sensitivity impact of reducing numerical precision in the graphics processing unit (GPU)-accelerated FDAS pipeline of the AstroAccelerate (AA) software package. The prior implementation used IEEE-754 single-precision in the entire binary pulsar detection pipeline, spending a large fraction of the runtime computing GPU-accelerated fast Fourier transforms. AA has been modified to use bfloat16 (and IEEE-754 double-precision to provide a “gold standard” comparison) within the Fourier domain convolution section of the FDAS routine. Approximately 20,000 synthetic pulsar filterbank files representing binary pulsars were generated using SIGPROC with a range of physical parameters. They have been processed using bfloat16, single-precision, and double-precision convolutions. All bfloat16 peaks are within 3% of the predicted signal-to-noise ratio of their corresponding single-precision peaks. Of 14,971 “bright” single-precision fundamental peaks above a power of 44.982 (our experimentally measured highest noise value), 14,602 (97.53%) have a peak in the same acceleration and frequency bin in the bfloat16 output plane, while in the remaining 369 the nearest peak is located in the adjacent acceleration bin. There is no bin drift measured between the single- and double-precision results. The bfloat16 version of FDAS achieves a speedup of approximately 1.6× compared to single-precision. A comparison between AA and the PRESTO software package is presented using observations collected with the GMRT of PSR J1544+4937, a 2.16 ms black widow pulsar in a 2.8 hr compact orbit.
- Abstract Plant metabolomes are structurally diverse. One of the most popular techniques for sampling this diversity is liquid chromatography–mass spectrometry (LC‐MS), which typically detects thousands of peaks from single organ extracts, many representing true metabolites. These peaks are usually annotated using in‐house retention time or spectral libraries, in silico fragmentation libraries, and increasingly through computational techniques such as machine learning. Despite these advances, over 85% of LC‐MS peaks remain unidentified, posing a major challenge for data analysis and biological interpretation. This bottleneck limits our ability to fully understand the diversity, functions, and evolution of plant metabolites. In this review, we first summarize current approaches for metabolite identification, highlighting their challenges and limitations. We further focus on alternative strategies that bypass the need for metabolite identification, allowing researchers to interpret global metabolic patterns and pinpoint key metabolite signals. These methods include molecular networking, distance‐based approaches, information theory–based metrics, and discriminant analysis. Additionally, we explore their practical applications in plant science and highlight a set of useful tools to support researchers in analyzing complex plant metabolomics data. By adopting these approaches, researchers can enhance their ability to uncover new insights into plant metabolism.