Title: A crustacean neuropeptide spectral library for data‐independent acquisition (DIA) mass spectrometry applications
Abstract Neuropeptides have tremendous potential for application in modern medicine, including utility as biomarkers and therapeutics. To overcome the inherent challenges associated with neuropeptide identification and characterization, data‐independent acquisition (DIA) is a fitting mass spectrometry (MS) method of choice to achieve sensitive and accurate analysis. It is advantageous for preliminary neuropeptidomic studies to occur in less complex organisms, with crustacean models serving as a popular choice due to their relatively simple nervous system. With spectral libraries serving as a means to interpret DIA‐MS output spectra, and Cancer borealis as a model of choice for neuropeptide analysis, we performed the first spectral library mapping of crustacean neuropeptides. Leveraging pre‐existing data‐dependent acquisition (DDA) spectra, a spectral library was built using PEAKS Online. The library comprises 333 unique neuropeptides. The identification results obtained through the use of this spectral library were compared with those achieved through library‐free analysis of crustacean brain, pericardial organs (PO), and thoracic ganglia (TG) tissues. A statistically significant increase (Student's t‐test, P value < 0.05) in the number of identifications achieved from the TG data was observed in the spectral library results. Furthermore, in each of the tissues, a distinctly different set of identifications was found in the library search compared to the library‐free search. This work highlights the necessity for the use of spectral libraries in neuropeptide analysis, illustrating the advantage of spectral libraries for interpreting DIA spectra in a reproducible manner with greater neuropeptidomic depth.
Award ID(s):
2108223
PAR ID:
10484131
Author(s) / Creator(s):
 ;  ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
PROTEOMICS
Volume:
24
Issue:
15
ISSN:
1615-9853
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Data-Independent Acquisition (DIA) is a method to improve consistent identification and precise quantitation of peptides and proteins by mass spectrometry (MS). The targeted data analysis strategy in DIA relies on spectral assay libraries that are generally derived from a priori measurements of peptides for each species. Although Escherichia coli (E. coli) is among the best studied model organisms, so far there is no spectral assay library for the bacterium publicly available. Here, we generated a spectral assay library for 4,014 of the 4,389 annotated E. coli proteins using one- and two-dimensional fractionated samples, and ion mobility separation enabling deep proteome coverage. We demonstrate the utility of this high-quality library with robustness in quantitation of the E. coli proteome and with rapid-chromatography to enhance throughput by targeted DIA-MS. The spectral assay library supports the detection and quantification of 91.5% of all E. coli proteins at high-confidence with 56,182 proteotypic peptides, making it a valuable resource for the scientific community. Data and spectral libraries are available via ProteomeXchange (PXD020761, PXD020785) and SWATHAtlas (SAL00222-28).
  2. Abstract Data-Independent Acquisition (DIA) is a mass spectrometry-based method to reliably identify and reproducibly quantify large fractions of a target proteome. The peptide-centric data analysis strategy employed in DIA requires a priori generated spectral assay libraries. Such assay libraries allow to extract quantitative data in a targeted approach and have been generated for human, mouse, zebrafish, E. coli and few other organisms. However, a spectral assay library for the extreme halophilic archaeon Halobacterium salinarum NRC-1, a model organism that contributed to several notable discoveries, is not publicly available yet. Here, we report a comprehensive spectral assay library to measure 2,563 of 2,646 annotated H. salinarum NRC-1 proteins. We demonstrate the utility of this library by measuring global protein abundances over time under standard growth conditions. The H. salinarum NRC-1 library includes 21,074 distinct peptides representing 97% of the predicted proteome and provides a new, valuable resource to confidently measure and quantify any protein of this archaeon. Data and spectral assay libraries are available via ProteomeXchange (PXD042770, PXD042774) and SWATHAtlas (SAL00312-SAL00319).
  3. Abstract Motivation: Tandem mass spectrometry (MS/MS) is a crucial technology for large-scale proteomic analysis. The protein database search or the spectral library search are commonly used for peptide identification from MS/MS spectra, which, however, may face challenges due to experimental variations between replicated spectra and similar fragmentation patterns among distinct peptides. To address this challenge, we present SpecEncoder, a deep metric learning approach to address these challenges by transforming MS/MS spectra into robust and sensitive embedding vectors in a latent space. The SpecEncoder model can also embed predicted MS/MS spectra of peptides, enabling a hybrid search approach that combines spectral library and protein database searches for peptide identification. Results: We evaluated SpecEncoder on three large human proteomics datasets, and the results showed a consistent improvement in peptide identification. For spectral library search, SpecEncoder identifies 1%–2% more unique peptides (and PSMs) than SpectraST. For protein database search, it identifies 6%–15% more unique peptides than MSGF+ enhanced by Percolator. Furthermore, SpecEncoder identified 6%–12% additional unique peptides when utilizing a combined library of experimental and predicted spectra. SpecEncoder can also identify more peptides when compared to deep-learning enhanced methods (MSFragger boosted by MSBooster). These results demonstrate SpecEncoder’s potential to enhance peptide identification for proteomic data analyses. Availability and Implementation: The source code and scripts for SpecEncoder and peptide identification are available on GitHub at https://github.com/lkytal/SpecEncoder. Contact: hatang@iu.edu.
  4. Tandem mass spectrometry (MS/MS) is crucial for small-molecule analysis; however, traditional computational methods are limited by incomplete reference libraries and complex data processing. Machine learning (ML) is transforming small-molecule mass spectrometry in three key directions: (a) predicting MS/MS spectra and related physicochemical properties to expand reference libraries, (b) improving spectral matching through automated pattern extraction, and (c) predicting molecular structures of compounds directly from their MS/MS spectra. We review ML approaches for molecular representations [descriptors, simplified molecular-input line-entry (SMILE) strings, and graphs] and MS/MS spectra representations (using binned vectors and peak lists) along with recent advances in spectra prediction, retention time, collision cross sections, and spectral matching. Finally, we discuss ML-integrated workflows for chemical formula identification. By addressing the limitations of current methods for compound identification, these ML approaches can greatly enhance the understanding of biological processes and the development of diagnostic and therapeutic tools. 
  5. Abstract Glycosylated neuropeptides were recently discovered in crustaceans, a model organism with a well‐characterized neuroendocrine system. Several workflows exist to characterize enzymatically digested peptides; however, the unique properties of endogenous neuropeptides require methods to be re‐evaluated. We investigate the use of hydrophilic interaction liquid chromatography (HILIC) enrichment and different fragmentation methods to further probe the expression of glycosylated neuropeptides in Callinectes sapidus. During the evaluation of HILIC, we observed the necessity of a less aqueous solvent for endogenous peptide samples. This modification enabled the number of detected neuropeptide glycoforms to increase almost two‐fold, from 18 to 36. Product ion‐triggered electron‐transfer/higher‐energy collision dissociation enabled the site‐specific detection of 55 intact N‐ and O‐linked glycoforms, while the faster stepped collision energy higher‐energy collisional dissociation resulted in detection of 25. Additionally, applying this workflow to five neuronal tissues enabled the characterization of 36 more glycoforms of known neuropeptides and 11 more glycoforms of nine putative novel neuropeptides. Overall, the database of glycosylated neuropeptides in crustaceans was largely expanded from 18 to 136 glycoforms of 40 neuropeptides from 10 neuropeptide families. Both macro‐ and micro‐heterogeneity were observed, demonstrating the chemical diversity of this simple invertebrate, establishing a framework to use crustacean to probe modulatory effects of glycosylation on neuropeptides.