Title: A web-based system for creating, viewing, and editing precursor mass spectrometry ground truth data
Abstract:
Background: Mass spectrometry (MS) uses mass-to-charge ratios of measured particles to decode the identities and quantities of molecules in a sample. Interpretation of raw MS depends upon data processing algorithms that render it human-interpretable. Quantitative MS workflows are complex experimental chains, and it is crucial to know the performance and bias of each data processing method, as they impact the accuracy, coverage, and statistical significance of the result. Creating the ground truth necessary for quantitatively evaluating MS1-aware algorithms is a difficult and tedious task, and better software for creating such datasets would facilitate more extensive evaluation and improvement of MS data processing algorithms.
Results: We present JS-MS 2.0, a software suite that provides a dependency-free, browser-based, one-click, cross-platform solution for creating MS1 ground truth. The software retains the first version’s capacity for loading, viewing, and navigating MS1 data in 2- and 3-D, and adds tools for capturing, editing, saving, and viewing isotopic envelope and extracted isotopic chromatogram features. The software can also be used to view and explore the results of feature finding algorithms.
Conclusions: JS-MS 2.0 enables faster creation and inspection of MS1 ground truth data. It is publicly available with an MIT license at github.com/optimusmoose/jsms.
Award ID(s):
1723248 1723006 1723196
PAR ID:
10222832
Author(s) / Creator(s):
;
Date Published:
Journal Name:
BMC Bioinformatics
Volume:
21
Issue:
1
ISSN:
1471-2105
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. An inherent strength of hydrogen/deuterium exchange coupled to mass spectrometry (HDX-MS) is its ability to detect the presence of multiple conformational states of a protein, which often manifest as multimodal isotopic envelopes. However, the statistical considerations for accurate analysis of multimodal spectra have yet to be established. Here we outline an unrestrained binomial distribution fitting approach with the corresponding statistical tests to accurately detect and, when possible, deconvolute isotopic distributions that contain multiple subpopulations. The algorithms have been incorporated into an updated version of the freely available software, HX-Express, and validated using known mixtures of peptides deuterated to varying degrees. This approach presents a readily accessible tool to fit and interpret bimodal and trimodal behavior in HDX-MS data for mixed populations, EX1 kinetics, and pulse labeling data. 
  2. A broad range of research fields benefit from the information extracted from naturalistic audio data. Speech research typically relies on the availability of human-generated metadata tags to comprise a set of “ground truth” labels for the development of speech processing algorithms. While the manual generation of metadata tags may be feasible on a small scale, unique problems arise when creating speech resources for massive, naturalistic audio data. This paper presents a general discussion on these challenges and highlights suggestions when creating metadata for speech resources that are intended to be useful both in speech research and in other fields. Further, it provides an overview of how the task of creating a speech resource for various communities has been and is continuing to be approached for the massive corpus of audio from the historic NASA Apollo missions, which includes tens of thousands of hours of naturalistic, team-based audio data featuring numerous speakers across multiple points in history. 
  3. Laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) imaging has been extensively used to determine the distributions of metals in biological tissues for a wide variety of applications. To be useful for identifying metal biodistributions, the acquired raw data needs to be reconstructed into a two-dimensional image. Several approaches have been developed for LA-ICP-MS image reconstruction, but less focus has been placed on software for more in-depth statistical processing of the imaging data. Yet, improved image processing can allow the biological ramifications of metal distributions in tissues to be better understood. In this work, we describe software written in Python that automatically reconstructs, analyzes, and segments images from LA-ICP-MS imaging data. Image segmentation is achieved using LA-ICP-MS signals from the biological metals Fe and Zn together with k-means clustering to automatically identify sub-organ regions in different tissues. Spatial awareness also can be incorporated into the images through a neighboring pixel evaluation that allows regions of interest to be identified that are at the limit of the LA-ICP-MS imaging resolution. The value of the described algorithms is demonstrated for LA-ICP-MS images of nanomaterial biodistributions. The developed image reconstruction and processing approach reveals that nanomaterials distribute in different sub-organ regions based on their chemical and physical properties, opening new possibilities for understanding the impact of such nanomaterials in vivo.
  4. Recent advances in native mass spectrometry (MS) and denatured intact protein MS have made these techniques essential for biotherapeutic characterization. As MS analysis has increased in throughput and scale, new data analysis workflows are needed to provide rapid quantitation from large datasets. Here, we describe the UniDec Processing Pipeline (UPP) for the analysis of batched biotherapeutic intact MS data. UPP is built into the UniDec software package, which provides fast processing, deconvolution, and peak detection. The user and programming interfaces for UPP read a spreadsheet that contains the data file names, deconvolution parameters, and quantitation settings. After iterating through the spreadsheet and analyzing each file, it returns a spreadsheet of results and HTML reports. We demonstrate the use of UPP to measure correct pairing percentage on a set of bispecific antibody data and to measure drug-to-antibody ratios from antibody-drug conjugates. Moreover, because the software is free and open-source, users can easily build on this platform to create customized workflows and calculations. Thus, UPP provides a flexible workflow that can be deployed in diverse settings and for a wide range of biotherapeutic applications. 
  5. Thomasson, J. Alex; Torres-Rua, Alfonso F. (Ed.)
    sUAS (small-Unmanned Aircraft System) and advanced surface energy balance models allow detailed assessment and monitoring (at plant scale) of different (agricultural, urban, and natural) environments. Significant progress has been made in the understanding and modeling of atmosphere-plant-soil interactions and numerical quantification of the internal processes at plant scale. Similarly, progress has been made in ground truth information comparison and validation models. An example of this progress is the application of sUAS information using the Two-Source Surface Energy Balance (TSEB) model in commercial vineyards by the Grape Remote sensing Atmospheric Profile and Evapotranspiration eXperiment (GRAPEX) Project in California. With advances in frequent sUAS data collection for larger areas, sUAS information processing becomes computationally expensive on local computers. Additionally, fragmentation of the different models and tools necessary to process the data and validate the results is a limiting factor. For example, in the GRAPEX project, commercial software (ArcGIS and MS Excel) and Python and Matlab code are needed to complete the analysis. There is a need to assess and integrate research conducted with sUAS and surface energy balance models in a sharing platform that can be easily migrated to high performance computing (HPC) resources. This research, sponsored by the National Science Foundation FAIR Cyber Training Fellowships, is integrating disparate software and code under a unified language (Python). The Python code for estimating the surface energy fluxes using the TSEB2T model, as well as the EC footprint analysis code for ground truth information comparison, were hosted on the myGeoHub site (https://mygeohub.org/) to be reproducible and replicable.