Tandem mass spectrometry (MS/MS) is crucial for small-molecule analysis; however, traditional computational methods are limited by incomplete reference libraries and complex data processing. Machine learning (ML) is transforming small-molecule mass spectrometry in three key directions: (a) predicting MS/MS spectra and related physicochemical properties to expand reference libraries, (b) improving spectral matching through automated pattern extraction, and (c) predicting molecular structures of compounds directly from their MS/MS spectra. We review ML approaches for molecular representations [descriptors, simplified molecular-input line-entry system (SMILES) strings, and graphs] and MS/MS spectra representations (binned vectors and peak lists), along with recent advances in the prediction of spectra, retention times, and collision cross sections, and in spectral matching. Finally, we discuss ML-integrated workflows for chemical formula identification. By addressing the limitations of current methods for compound identification, these ML approaches can greatly enhance the understanding of biological processes and the development of diagnostic and therapeutic tools.
Biological mass spectrometry enables spatiotemporal ‘omics: From tissues to cells to organelles
Biological processes unfold across broad spatial and temporal dimensions, and measurement of the underlying molecular world is essential to their understanding. Interdisciplinary efforts advanced mass spectrometry (MS) into a tour de force for assessing virtually all levels of the molecular architecture, some with exquisite detection sensitivity and scalability in space-time. In this review, we offer vignettes of milestones in technology innovations that ushered sample collection and processing, chemical separation, ionization, and 'omics analyses to progressively finer resolutions in the realms of tissue biopsies and limited cell populations, single cells, and subcellular organelles. Also highlighted are methodologies that empowered the acquisition and analysis of multidimensional MS data sets to reveal proteomes, peptidomes, and metabolomes in ever-deepening coverage in these limited and dynamic specimens. In pursuit of richer knowledge of biological processes, we discuss efforts pioneering the integration of orthogonal approaches from molecular and functional studies, both within and beyond MS. With established and emerging community-wide efforts ensuring scientific rigor and reproducibility, spatiotemporal MS has emerged as an exciting and powerful resource to study biological systems in space-time.
- Award ID(s): 1832968
- PAR ID: 10422240
- Date Published:
- Journal Name: Mass Spectrometry Reviews
- ISSN: 0277-7037
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Sweedler, J. V.; Eberwine, J. H.; Fraser, S. E. (Ed.) Molecular composition is intricately intertwined with cellular function, and elucidation of this relationship is essential for understanding life processes and developing next-generation therapeutics. Technological innovations in capillary electrophoresis (CE) and liquid chromatography (LC)-mass spectrometry (MS) provide previously unavailable insights into cellular biochemistry by allowing for the unbiased detection and quantification of molecules with high specificity. This chapter presents our validated protocols integrating ultrasensitive MS with classical tools of cell, developmental, and neurobiology to assess the biological function of important biomolecules. We use CE- and LC-MS to measure hundreds of metabolites and thousands of proteins in single cells or limited populations of tissues in chordate embryos and mammalian neurons, revealing molecular heterogeneity between identified cells. By pairing microinjection and optical microscopy, we enable cell lineage tracing and testing of the roles that dysregulated molecules play in the formation and maintenance of cell heterogeneity and tissue specification in frog embryos (Xenopus laevis). Electrophysiology extends our workflows to characterizing neuronal activity in sections of mammalian brain tissues. The information obtained from these studies mutually strengthens chemistry and biology and highlights the importance of interdisciplinary research in advancing basic knowledge and translational applications.
-
Glycans are among the most widely investigated biomolecules because of their roles in numerous vital biological processes. However, few system-independent, LC-MS/MS (liquid chromatography tandem mass spectrometry) based studies have been developed with this particular goal. Standard approaches generally rely on normalized retention times as well as the mass-to-charge ratio (m/z) values of ions. Because of these limitations, there is a need for quantitative characterization methods that can be used independently of m/z values, utilizing only normalized retention times. As such, the primary goal of this article is to construct an LC-MS/MS based classification of the glycans derived from standard glycoproteins and human blood serum using a glucose unit index as the reference frame in the space of compound parameters. For the reference frame, we develop a closed-form analytic formula via the Green's function of a relevant convection-diffusion-absorption equation used to model composite material transport. The aforementioned equation is derived from an Einstein–Brownian motion paradigm, which provides a physical interpretation of the time-dependence at the point of observation for molecular transport in the experiment. The necessary coefficients are determined via a data-driven learning procedure. The methodology is presented abstractly and validated via comparison with experimental mass spectrometer data.
-
Global loss of biodiversity and its associated ecosystem services is occurring at an alarming rate and is predicted to accelerate in the future. Metacommunity theory provides a framework to investigate multi-scale processes that drive change in biodiversity across space and time. Short-term ecological studies across space have progressed our understanding of biodiversity through a metacommunity lens; however, such snapshots in time have been limited in their ability to explain which processes, at which scales, generate observed spatial patterns. Temporal dynamics of metacommunities have been understudied, and large gaps in theory and empirical data have hindered progress in our understanding of underlying metacommunity processes that give rise to biodiversity patterns. Fortunately, we are at an important point in the history of ecology, where long-term studies with cross-scale spatial replication provide a means to gain a deeper understanding of the multiscale processes driving biodiversity patterns in time and space to inform metacommunity theory. The maturation of coordinated research and observation networks, such as the United States Long Term Ecological Research (LTER) program, provides an opportunity to advance explanation and prediction of biodiversity change with observational and experimental data at spatial and temporal scales greater than any single research group could accomplish. Synthesis of LTER network community datasets illustrates that long-term studies with spatial replication present an under-utilized resource for advancing spatio-temporal metacommunity research. We identify challenges towards synthesizing these data and present recommendations for addressing these challenges. We conclude with insights about how future monitoring efforts by coordinated research and observation networks could further the development of metacommunity theory and its applications aimed at improving conservation efforts.
-
Amid global challenges like climate change, extinctions, and disease epidemics, science and society require nuanced, international solutions that are grounded in robust, interdisciplinary perspectives and datasets that span deep time. Natural history collections, from modern biological specimens to the archaeological and fossil records, are crucial tools for understanding cultural and biological processes that shape our modern world. At the same time, natural history collections in low and middle-income countries are at-risk and underresourced, imperiling efforts to build the infrastructure and scientific capacity necessary to tackle critical challenges. The case of Mongolia exemplifies the unique challenges of preserving natural history collections in a country with limited financial resources under the thumb of scientific colonialism. Specifically, the lack of biorepository infrastructure throughout Mongolia stymies efforts to study or respond to large-scale environmental changes of the modern era. Investment in museum capacity and training to develop locally-accessible collections that characterize natural communities over time and space must be a key priority for a future where understanding climate scenarios, predicting and responding to zoonotic disease, making informed conservation choices, or adapting to agricultural challenges will be all but impossible without relevant and accessible collections.

