
This content will become publicly available on February 28, 2025

Title: IsoForma: An R Package for Quantifying and Visualizing Positional Isomers in Top-Down LC-MS/MS Data
Proteoforms, the different forms of a protein with sequence variations including post-translational modifications (PTMs), execute vital functions in biological systems, such as cell signaling and epigenetic regulation. Advances in top-down mass spectrometry (MS) technology have permitted the direct characterization of intact proteoforms and their exact number of modification sites, allowing for the relative quantification of positional isomers (PI). Protein positional isomers are a set of proteoforms with identical total mass and the same set of modifications, but different PTM site combinations. The relative abundance of PI can be estimated by matching proteoform-specific fragment ions to top-down tandem MS (MS2) data to localize and quantify modifications. However, current approaches rely heavily on manual annotation. Here, we present IsoForma, an open-source R package for the relative quantification of PI within a single tool. In benchmarks against two existing workflows, IsoForma produced comparable results with improved speed. Overall, IsoForma provides a streamlined process for quantifying PI, reduces the analysis time, and offers an essential framework for developing customized proteoform analysis workflows. The software is open source and available at
Publisher / Repository:
American Chemical Society
Journal Name:
Journal of Proteome Research
Sponsoring Org:
National Science Foundation
More Like this
  1. Mass spectrometry (MS)-based spatially resolved top-down proteomics (TDP) of tissues is crucial for understanding the roles played by microenvironmental heterogeneity in the biological functions of organs and for discovering new proteoform biomarkers of diseases. There are few published spatially resolved TDP studies. One challenge is the limited performance of TDP on spatially isolated samples obtained by, for example, laser capture microdissection (LCM), because those samples are usually mass-limited. We present the first pilot study of LCM-capillary zone electrophoresis (CZE)-MS/MS for spatially resolved TDP, using zebrafish brain as the sample. The LCM-CZE-MS/MS platform employed a non-ionic detergent and a freeze–thaw method for efficient proteoform extraction from LCM-isolated brain sections, followed by CZE-MS/MS without any sample cleanup step, ensuring high sensitivity. Over 400 proteoforms were identified in a CZE-MS/MS analysis of one LCM brain section, consuming the protein content of roughly 250 cells. We observed drastic differences in proteoform profiles between two LCM brain sections isolated from the optic tectum (Teo) and telencephalon (Tel) regions. Proteoforms of three proteins (npy, penkb, and pyya) having neuropeptide hormone activity were exclusively identified in the isolated Tel section. Proteoforms of reticulon, myosin, and troponin were almost exclusively identified in the isolated Teo section, and those proteins play essential roles in visual and motor activities. The proteoform profiles accurately reflected the main biological functions of the Teo and Tel regions of the brain. Additionally, hundreds of post-translationally modified proteoforms were identified.
  2. Mass spectrometry (MS)-based denaturing top-down proteomics (dTDP) identifies proteoforms without prior enzymatic proteolysis. A universal sample preparation method that can efficiently extract protein, reduce sample loss, maintain protein solubility, and remain compatible with subsequent liquid-phase separation, MS, and tandem MS (MS/MS) is vital for large-scale proteoform characterization. Membrane ultrafiltration (MU) was employed here for buffer exchange to efficiently remove the sodium dodecyl sulfate (SDS) detergent used for protein extraction and solubilization from protein samples, followed by capillary zone electrophoresis (CZE)-MS/MS analysis. The MU method showed good protein recovery, minimal protein bias, and good compatibility with CZE-MS/MS. Single-shot CZE-MS/MS analysis of an Escherichia coli sample prepared by the MU method identified over 800 proteoforms.
  3. We present a large-scale top-down proteomics (TDP) study of plant leaf and chloroplast proteins, achieving the identification of over 4700 unique proteoforms. Using capillary zone electrophoresis coupled with tandem mass spectrometry analysis of offline size-exclusion chromatography fractions, we identify 3198 proteoforms for total leaf and 1836 proteoforms for chloroplast, with 1024 and 363 proteoforms having post-translational modifications, respectively. The electrophoretic mobility prediction of capillary zone electrophoresis allowed us to validate post-translational modifications that impact the charge state, such as acetylation and phosphorylation. Identified modifications included Trp (di)oxidation events on six chloroplast proteins that may represent novel targets of singlet oxygen sensing. Furthermore, our TDP data provide direct experimental evidence of the N- and C-terminal residues of numerous mature proteoforms from chloroplast, mitochondria, endoplasmic reticulum, and other sub-cellular localizations. With this information, we suggest true transit peptide cleavage sites and correct sub-cellular localization signal predictions. This large-scale analysis illustrates the power of top-down proteoform identification of post-translational modifications and intact sequences that can benefit our understanding of both the structure and function of hundreds of plant proteins.
  4. Characterization of histone proteoforms with various post-translational modifications (PTMs) is critical for a better understanding of the functions of histone proteoforms in epigenetic control of gene expression. Mass spectrometry (MS)-based top-down proteomics (TDP) is a valuable approach for delineating histone proteoforms because it can provide a bird's-eye view of histone proteoforms carrying diverse combinations of PTMs. Here, we present the first example of coupling capillary zone electrophoresis (CZE), ion mobility spectrometry (IMS), and MS for online multi-dimensional separations of histone proteoforms. Our CZE-high-field asymmetric waveform IMS (FAIMS)-MS/MS platform identified 366 (ProSight PD) and 602 (TopPIC) histone proteoforms from a commercial calf histone sample using a low microgram amount of histone sample as the starting material. CZE-FAIMS-MS/MS improved the number of histone proteoform identifications about 3-fold compared to CZE-MS/MS alone (without FAIMS). The results indicate that CZE-FAIMS-MS/MS could be a useful tool for comprehensive characterization of histone proteoforms with high sensitivity.
  5. Hypertrophic cardiomyopathy (HCM) is the most common heritable heart disease. Although the genetic cause of HCM has been linked to mutations in genes encoding sarcomeric proteins, the ability to predict clinical outcomes based on specific mutations in HCM patients is limited. Moreover, how mutations in different sarcomeric proteins can result in highly similar clinical phenotypes remains unknown. Posttranslational modifications (PTMs) and alternative splicing regulate the function of sarcomeric proteins; hence, it is critical to study HCM at the level of proteoforms to gain insights into the mechanisms underlying HCM. Herein, we employed high-resolution mass spectrometry–based top-down proteomics to comprehensively characterize sarcomeric proteoforms in septal myectomy tissues from HCM patients exhibiting severe outflow tract obstruction (n = 16) compared to nonfailing donor hearts (n = 16). We observed a complex landscape of sarcomeric proteoforms arising from combinatorial PTMs, alternative splicing, and genetic variation in HCM. A coordinated decrease of phosphorylation in important myofilament and Z-disk proteins with a linear correlation suggests PTM cross-talk in the sarcomere and dysregulation of protein kinase A pathways in HCM. Strikingly, we discovered that the sarcomeric proteoform alterations in the myocardium of HCM patients undergoing septal myectomy were remarkably consistent, regardless of the underlying HCM-causing mutations. This study suggests that the manifestation of severe HCM coalesces at the proteoform level despite distinct genotypes, which underscores the importance of molecular characterization of the HCM phenotype and presents an opportunity to identify broad-spectrum treatments to mitigate the most severe manifestations of this genetically heterogeneous disease.