Title: Metaproteomics as a tool for studying the protein landscape of human-gut bacterial species
Host-microbiome interactions and the microbial community have a broad impact on human health and disease. Most microbiome studies are performed at the genome level using next-generation sequencing, but metaproteomics is emerging as a powerful technique for studying microbiome functional activity by characterizing the complex and dynamic composition of microbial proteins. We conducted a large-scale survey of human gut microbiome metaproteomic data to identify generalist species that are ubiquitously expressed across all samples and specialist species that are highly expressed in a small subset of samples associated with a certain phenotype. We used the metaproteomic mass spectrometry data to reveal the protein landscapes of these species, which enables characterization of the expression levels of proteins of different functions and of underlying regulatory mechanisms, such as operons. Finally, we recovered a large number of open reading frames (ORFs) with spectral support that were missed by de novo protein-coding gene predictors. We showed that a majority of the rescued ORFs overlapped with de novo predicted protein-coding genes, but on opposite strands or in different frames. Together, these results demonstrate applications of metaproteomics for the characterization of important gut bacterial species.
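The observation that rescued ORFs overlap predicted genes "on opposite strands or in different frames" can be made concrete with a small sketch. The `Feature` type, the 1-based inclusive coordinate convention, and the category names below are assumptions made for illustration; this is not the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A genomic interval (hypothetical layout, for illustration only)."""
    start: int   # 1-based inclusive start, forward-strand coordinates
    end: int     # 1-based inclusive end, forward-strand coordinates
    strand: str  # "+" or "-"


def overlap_length(a: Feature, b: Feature) -> int:
    """Number of shared positions between two intervals (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def frame(f: Feature) -> int:
    """Reading frame as the position of the first codon base modulo 3.

    On the "+" strand translation begins at `start`; on the "-" strand
    it begins at `end`.
    """
    anchor = f.start if f.strand == "+" else f.end
    return anchor % 3


def classify(orf: Feature, gene: Feature) -> str:
    """Relate a rescued ORF to a predicted gene."""
    if overlap_length(orf, gene) == 0:
        return "no_overlap"
    if orf.strand != gene.strand:
        return "opposite_strand"
    if frame(orf) != frame(gene):
        return "different_frame"
    return "same_frame"
```

Comparing frames only within the same strand keeps the rule simple: two same-strand features share a frame exactly when their first codon positions are congruent modulo 3.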
Award ID(s):
2025451
NSF-PAR ID:
10355919
Editor(s):
Coelho, Luis Pedro
Date Published:
Journal Name:
PLOS Computational Biology
Volume:
18
Issue:
3
ISSN:
1553-7358
Page Range / eLocation ID:
e1009397
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract. Background: Several recent large-scale efforts have significantly expanded the collection of human-associated bacterial genomes, which now contains thousands of entities including reference complete/draft genomes and metagenome-assembled genomes (MAGs). These genomes provide a useful resource for studying the functionality of the human-associated microbiome and its relationship with human health and disease. One application of these genomes is to provide a universal reference for database search in metaproteomic studies when matched metagenomic/metatranscriptomic data are unavailable. However, a larger collection of reference genomes may not necessarily result in better peptide/protein identification, because the increase in search space often leads to fewer spectrum-peptide matches, not to mention the drastic increase in computation time. Methods: Here, we present a new approach that uses two steps to optimize the use of the reference genomes and MAGs as the universal reference for human gut metaproteomic MS/MS data analysis. The first step is to use only the high-abundance proteins (HAPs) (i.e., ribosomal proteins and elongation factors) for metaproteomic MS/MS database search and, based on the identification results, to derive the taxonomic composition of the underlying microbial community. The second step is to expand the search database by including all proteins from the identified abundant species. We call our approach HAPiID (HAPs guided metaproteomics IDentification). Results: We tested our approach using human gut metaproteomic datasets from a previous study and compared it to the state-of-the-art reference database search method MetaPro-IQ for metaproteomic identification in studying human gut microbiota. Our results show that our two-step method not only ran significantly faster but also identified more peptides. We further demonstrated the application of HAPiID to revealing protein profiles of individual human-associated bacterial species, one or a few species at a time, using metaproteomic data. Conclusions: The HAP-guided profiling approach presents a novel and effective way of constructing target databases for metaproteomic data analysis. The HAPiID pipeline built upon this approach provides a universal tool for analyzing human gut-associated metaproteomic data.
  2. Metaproteomics is a powerful tool for the characterization of metabolism, physiology, and functional interactions in microbial communities, including plant-associated microbiota. However, the metaproteomic methods that have been used to study plant-associated microbiota are very laborious and require large amounts of plant tissue, hindering wider application of these methods. We optimized and evaluated different protein extraction methods for metaproteomics of plant-associated microbiota in two different plant species (Arabidopsis and maize). Our main goal was to identify a method that would work with low amounts of input material (40 to 70 mg) and that would maximize the number of identified microbial proteins. We tested eight protocols, each comprising a different combination of physical lysis method, extraction buffer, and cell-enrichment method on roots from plants grown with synthetic microbial communities. We assessed the performance of the extraction protocols by liquid chromatography-tandem mass spectrometry–based metaproteomics and found that the optimal extraction method differed between the two species. For Arabidopsis roots, protein extraction by beating whole roots with small beads provided the greatest number of identified microbial proteins and improved the identification of proteins from gram-positive bacteria. For maize, vortexing root pieces in the presence of large glass beads yielded the greatest number of microbial proteins identified. Based on these data, we recommend the use of these two methods for metaproteomics with Arabidopsis and maize. Furthermore, detailed descriptions of the eight tested protocols will enable future optimization of protein extraction for metaproteomics in other dicot and monocot plants. Copyright © 2022 The Author(s). This is an open access article distributed under the CC BY 4.0 International license.
  3. Abstract Background

    Stable isotope probing (SIP) approaches are a critical tool in microbiome research to determine associations between species and substrates, as well as the activity of species. The application of these approaches ranges from studying microbial communities important for global biogeochemical cycling to host-microbiota interactions in the intestinal tract. Current SIP approaches, such as DNA-SIP or nanoSIMS, allow analysis of stable isotope incorporation with high coverage of taxa in a community and at the single-cell level, respectively; however, they are limited in terms of sensitivity, resolution, or throughput.

    Results

    Here, we present an ultra-sensitive, high-throughput protein-based stable isotope probing approach (Protein-SIP), which cuts the cost of labeled substrates by 50–99% compared to other SIP and Protein-SIP approaches and thus enables isotope labeling experiments on much larger scales and with higher replication. The approach allows for the determination of isotope incorporation into microbiome members with species-level resolution using standard metaproteomics liquid chromatography-tandem mass spectrometry (LC–MS/MS) measurements. At the core of the approach are new algorithms to analyze the data, which have been implemented in open-source software (https://sourceforge.net/projects/calis-p/). We demonstrate sensitivity, precision, and accuracy using bacterial cultures and mock communities with different labeling schemes. Furthermore, we benchmark our approach against two existing Protein-SIP approaches and show that, in the low labeling range used, our approach is the most sensitive and accurate. Finally, we measure translational activity using ¹⁸O heavy water labeling in a 63-species community derived from human fecal samples grown on media simulating two different diets. Activity could be quantified on average for 27 species per sample, with 9 species showing significantly higher activity on a high-protein diet than on a high-fiber diet. Surprisingly, among the species with increased activity on high protein were several Bacteroides species known as fiber consumers. Apparently, protein supply is a critical consideration when assessing growth of intestinal microbes on fiber, including fiber-based prebiotics.

    Conclusions

    We demonstrate that our Protein-SIP approach allows for the ultra-sensitive (0.01 to 10% label) detection of stable isotopes of elements found in proteins, using standard metaproteomics data.
  4. Gralnick, Jeffrey A. (Ed.)
    ABSTRACT Field studies are central to environmental microbiology and microbial ecology, because they enable studies of natural microbial communities. Metaproteomics, the study of protein abundances in microbial communities, allows investigators to study these communities “in situ,” which requires protein preservation directly in the field because protein abundance patterns can change rapidly after sampling. Ideally, a protein preservative for field deployment works rapidly and preserves the whole proteome, is stable in long-term storage, is nonhazardous and easy to transport, and is available at low cost. Although these requirements might be met by several protein preservatives, an assessment of their suitability under field conditions when targeted for metaproteomic analyses is currently lacking. Here, we compared the protein preservation performance of flash freezing and the preservation solution RNAlater using the marine gutless oligochaete Olavius algarvensis and its symbiotic microbes as a test case. In addition, we evaluated long-term RNAlater storage after 1 day, 1 week, and 4 weeks at room temperature (22°C to 23°C). We evaluated protein preservation using one-dimensional liquid chromatography-tandem mass spectrometry. We found that RNAlater and flash freezing preserved proteins equally well in terms of total numbers of identified proteins and relative abundances of individual proteins, and none of the test time points was altered, compared to time zero. Moreover, we did not find biases against specific taxonomic groups or proteins with particular biochemical properties. Based on our metaproteomic data and the logistical requirements for field deployment, we recommend RNAlater for protein preservation of field-collected samples targeted for metaproteomic analyses.
IMPORTANCE Metaproteomics, the large-scale identification and quantification of proteins from microbial communities, provides direct insights into the phenotypes of microorganisms on the molecular level. To ensure the integrity of the metaproteomic data, samples need to be preserved immediately after sampling to avoid changes in protein abundance patterns. In laboratory setups, samples for proteomic analyses are most commonly preserved by flash freezing; however, liquid nitrogen or dry ice is often unavailable at remote field locations, due to their hazardous nature and transport restrictions. Our study shows that RNAlater can serve as a low-hazard, easy-to-transport alternative to flash freezing for field preservation of samples for metaproteomic analyses. We show that RNAlater preserves the metaproteome equally well, compared to flash freezing, and protein abundance patterns remain stable during long-term storage for at least 4 weeks at room temperature.
  5. Ranaviruses (Iridoviridae), including Frog Virus 3 (FV3), are large dsDNA viruses that cause devastating infections globally in amphibians, fish, and reptiles, and contribute to catastrophic amphibian declines. FV3’s large genome (~105 kb) contains at least 98 putative open reading frames (ORFs) as annotated in its reference genome. Previous studies have classified these coding genes into temporal classes as immediate early, delayed early, and late viral transcripts based on their sequential expression during FV3 infection. To establish a high-throughput characterization of ranaviral gene expression at the genome scale, we performed a whole transcriptomic analysis (RNA-Seq) using total RNA samples containing both viral and cellular transcripts from FV3-infected Xenopus laevis adult tissues using two FV3 strains, a wild type (FV3-WT) and an ORF64R-deleted recombinant (FV3-∆64R). In samples from the infected intestine, liver, spleen, lung, and especially kidney, an FV3-targeted transcriptomic analysis mapped reads spanning the full-genome coverage at ~10× depth on both positive and negative strands. By contrast, reads were only mapped to partial genomic regions in samples from the infected thymus, skin, and muscle. Extensive analyses validated the expression of almost all of the 98 annotated ORFs and profiled their differential expression in a tissue-, virus-, and temporal class-dependent manner. Further studies identified several putative ORFs that encode hypothetical proteins containing viral mimicking conserved domains found in host interferon (IFN) regulatory factors (IRFs) and IFN receptors. This study provides the first comprehensive genome-wide viral transcriptome profiling during infection and across multiple amphibian host tissues that will serve as an instrumental reference. Our findings imply that Ranaviruses like FV3 have acquired previously unknown molecular mimics, interfering with host IFN signaling during evolution. 
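As a rough illustration of the two-step HAPiID strategy summarized in item 1 above (search high-abundance proteins first, then expand the database to all proteins from the species that were detected), the database bookkeeping might look like the following Python sketch. The record layout, the keyword matching on annotations, and the `min_hits` threshold are assumptions made for illustration and are not taken from the HAPiID pipeline; the actual MS/MS search is left to an external engine and is represented here only by a list of identified protein IDs.

```python
from collections import Counter

# Assumed annotation keywords for high-abundance proteins (HAPs).
HAP_KEYWORDS = ("ribosomal protein", "elongation factor")


def build_hap_database(proteins):
    """Step 1 database: keep only proteins annotated as HAPs.

    `proteins` is a list of dicts with 'id', 'genome', and 'annotation'
    keys (a hypothetical record layout).
    """
    return [
        p for p in proteins
        if any(k in p["annotation"].lower() for k in HAP_KEYWORDS)
    ]


def abundant_genomes(identified_ids, proteins, min_hits=2):
    """Genomes supported by at least `min_hits` identified HAPs.

    `identified_ids` stands in for the output of an external MS/MS
    database search; `min_hits` is an illustrative cutoff.
    """
    genome_of = {p["id"]: p["genome"] for p in proteins}
    hits = Counter(genome_of[i] for i in identified_ids if i in genome_of)
    return {g for g, n in hits.items() if n >= min_hits}


def build_expanded_database(proteins, genomes):
    """Step 2 database: all proteins from the selected genomes."""
    return [p for p in proteins if p["genome"] in genomes]
```

In use, the HAP database would be searched first, the identified protein IDs passed to `abundant_genomes`, and the resulting genome set used to build the larger second-round database.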