Title: IsoForma: An R Package for Quantifying and Visualizing Positional Isomers in Top-Down LC-MS/MS Data
Proteoforms, the different forms of a protein with sequence variations including post-translational modifications (PTMs), execute vital functions in biological systems, such as cell signaling and epigenetic regulation. Advances in top-down mass spectrometry (MS) technology have permitted the direct characterization of intact proteoforms and their exact number of modification sites, allowing for the relative quantification of positional isomers (PI). Protein positional isomers are a set of proteoforms with identical total mass and the same set of modifications, but varying PTM site combinations. The relative abundance of PI can be estimated by matching proteoform-specific fragment ions to top-down tandem MS (MS2) data to localize and quantify modifications. However, current approaches rely heavily on manual annotation. Here, we present IsoForma, an open-source R package for the relative quantification of PI within a single tool. Benchmarking IsoForma's performance against two existing workflows produced comparable results and improvements in speed. Overall, IsoForma provides a streamlined process for quantifying PI, reduces the analysis time, and offers an essential framework for developing customized proteoform analysis workflows. The software is open source and available at https://github.com/EMSL-Computing/isoforma-lib.
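To make the definition of positional isomers concrete, the minimal Python sketch below enumerates every way to place a fixed number of identical modifications on a set of candidate sites. The example sequence, the choice of lysine as the modifiable residue, and the function name `positional_isomers` are hypothetical illustrations. They are not part of IsoForma, which is an R package whose interface is not shown here.

```python
from itertools import combinations

def positional_isomers(sequence, target_residues, n_mods):
    """Enumerate site combinations for n_mods identical modifications.

    Hypothetical helper for illustration only; not IsoForma's API.
    Positions are 1-based, as is conventional for protein residues.
    """
    # Candidate sites: every position whose residue can carry the modification.
    sites = [i + 1 for i, aa in enumerate(sequence) if aa in target_residues]
    # Each combination of n_mods sites is one positional isomer: same total
    # mass and same set of modifications, different site placement.
    return list(combinations(sites, n_mods))

# Made-up peptide with lysines at positions 2, 5, and 7.
isomers = positional_isomers("AKSTKGKA", "K", 2)
for iso in isomers:
    print(iso)
```

With three candidate lysines and two modifications, the sketch lists three isomers. Telling them apart experimentally depends on the proteoform-specific fragment ions that the abstract describes, which this sketch does not model.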
Award ID(s):
1943439
PAR ID:
10496136
Publisher / Repository:
American Chemical Society
Journal Name:
Journal of Proteome Research
ISSN:
1535-3893
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Mass spectrometry (MS)-based denaturing top-down proteomics (dTDP) identifies proteoforms without enzymatic proteolysis. A universal sample preparation method that can efficiently extract proteins, reduce sample loss, maintain protein solubility, and remain compatible with subsequent liquid-phase separation, MS, and tandem MS (MS/MS) is vital for large-scale proteoform characterization. Membrane ultrafiltration (MU) was employed here for buffer exchange to efficiently remove the sodium dodecyl sulfate (SDS) detergent used for protein extraction and solubilization, followed by capillary zone electrophoresis (CZE)-MS/MS analysis. The MU method showed good protein recovery, minimal protein bias, and good compatibility with CZE-MS/MS. Single-shot CZE-MS/MS analysis of an Escherichia coli sample prepared by the MU method identified over 800 proteoforms.
  2. Mass spectrometry (MS)-based spatially resolved top-down proteomics (TDP) of tissues is crucial for understanding the roles played by microenvironmental heterogeneity in the biological functions of organs and for discovering new proteoform biomarkers of diseases. There are few published spatially resolved TDP studies. One challenge is the limited performance of TDP on spatially isolated samples obtained by, for example, laser capture microdissection (LCM), because those samples are usually mass-limited. We present the first pilot study of LCM-capillary zone electrophoresis (CZE)-MS/MS for spatially resolved TDP, using zebrafish brain as the sample. The LCM-CZE-MS/MS platform employed a non-ionic detergent and a freeze–thaw method for efficient proteoform extraction from LCM-isolated brain sections, followed by CZE-MS/MS without any sample cleanup step, ensuring high sensitivity. Over 400 proteoforms were identified in a CZE-MS/MS analysis of one LCM brain section, consuming the protein content of roughly 250 cells. We observed drastic differences in proteoform profiles between two LCM brain sections isolated from the optic tectum (Teo) and telencephalon (Tel) regions. Proteoforms of three proteins (npy, penkb, and pyya) having neuropeptide hormone activity were exclusively identified in the isolated Tel section. Proteoforms of reticulon, myosin, and troponin were almost exclusively identified in the isolated Teo section, and those proteins play essential roles in visual and motor activities. The proteoform profiles accurately reflected the main biological functions of the Teo and Tel regions of the brain. Additionally, hundreds of post-translationally modified proteoforms were identified.
  3. Proteoforms, which arise from post-translational modifications, genetic polymorphisms and RNA splice variants, play a pivotal role as drivers in biology. Understanding proteoforms is essential to unravel the intricacies of biological systems and bridge the gap between genotypes and phenotypes. By analysing whole proteins without digestion, top-down proteomics (TDP) provides a holistic view of the proteome and can decipher protein function, uncover disease mechanisms and advance precision medicine. This Primer explores TDP, including the underlying principles, recent advances and an outlook on the future. The experimental section discusses instrumentation, sample preparation, intact protein separation, tandem mass spectrometry techniques and data collection. The results section looks at how to decipher raw data, visualize intact protein spectra and unravel data analysis. Additionally, proteoform identification, characterization and quantification are summarized, alongside approaches for statistical analysis. Various applications are described, including the human proteoform project and biomedical, biopharmaceutical and clinical sciences. These are complemented by discussions on measurement reproducibility, limitations and a forward-looking perspective that outlines areas where the field can advance, including potential future applications. 
  4. We present a large‐scale top‐down proteomics (TDP) study of plant leaf and chloroplast proteins, achieving the identification of over 4700 unique proteoforms. Using capillary zone electrophoresis coupled with tandem mass spectrometry analysis of offline size‐exclusion chromatography fractions, we identify 3198 proteoforms for total leaf and 1836 proteoforms for chloroplast, with 1024 and 363 proteoforms having post‐translational modifications, respectively. The electrophoretic mobility prediction of capillary zone electrophoresis allowed us to validate post‐translational modifications that impact the charge state such as acetylation and phosphorylation. Identified modifications included Trp (di)oxidation events on six chloroplast proteins that may represent novel targets of singlet oxygen sensing. Furthermore, our TDP data provides direct experimental evidence of the N‐ and C‐terminal residues of numerous mature proteoforms from chloroplast, mitochondria, endoplasmic reticulum, and other sub‐cellular localizations. With this information, we suggest true transit peptide cleavage sites and correct sub‐cellular localization signal predictions. This large‐scale analysis illustrates the power of top‐down proteoform identification of post‐translational modifications and intact sequences that can benefit our understanding of both the structure and function of hundreds of plant proteins.
  5. Multilevel proteomics aims to delineate proteins at the peptide (bottom-up proteomics), proteoform (top-down proteomics), and protein complex (native proteomics) levels. Capillary electrophoresis-mass spectrometry (CE-MS) can achieve highly efficient separation and highly sensitive detection of complex mixtures of peptides, proteoforms, and even protein complexes because of its substantial technical progress. CE-MS has become a valuable alternative to the routinely used liquid chromatography-mass spectrometry for multilevel proteomics. This review summarizes the most recent (2019-2021) advances of CE-MS for multilevel proteomics regarding technological progress and biological applications. We also provide brief perspectives on CE-MS for multilevel proteomics at the end, highlighting some future directions and potential challenges. 