

Title: METABOLIC: high-throughput profiling of microbial genomes for functional traits, metabolism, biogeochemistry, and community-scale functional networks
Abstract Background

Advances in microbiome science are being driven in large part by our ability to study and infer microbial ecology from genomes reconstructed from mixed microbial communities using metagenomics and single-cell genomics. Such omics-based techniques allow us to read genomic blueprints of microorganisms, decipher their functional capacities and activities, and reconstruct their roles in biogeochemical processes. Currently available tools for analyses of genomic data can annotate and depict metabolic functions to some extent; however, no standardized approaches are currently available for the comprehensive characterization of metabolic predictions, metabolite exchanges, microbial interactions, and microbial contributions to biogeochemical cycling.

Results

We present METABOLIC (METabolic And BiogeOchemistry anaLyses In miCrobes), scalable software to advance microbial ecology and biogeochemistry studies using genomes at the resolution of individual organisms and/or microbial communities. The genome-scale workflow includes annotation of microbial genomes, motif validation of biochemically validated conserved protein residues, metabolic pathway analyses, and calculation of contributions to individual biogeochemical transformations and cycles. The community-scale workflow supplements genome-scale analyses with determination of genome abundance in the microbiome, potential microbial metabolic handoffs and metabolite exchange, reconstruction of functional networks, and determination of microbial contributions to biogeochemical cycles. METABOLIC can take input genomes from isolates, metagenome-assembled genomes, or single-cell genomes. Results are presented in the form of tables for metabolism and a variety of visualizations including biogeochemical cycling potential, representation of sequential metabolic transformations, community-scale microbial functional networks using a newly defined metric “MW-score” (metabolic weight score), and metabolic Sankey diagrams. METABOLIC takes ~3 h with 40 CPU threads to process ~100 genomes and their corresponding metagenomic reads; of this, hmmsearch, the most compute-demanding step, takes ~45 min. For ~3600 genomes, hmmsearch takes ~5 h to complete. Tests of accuracy, robustness, and consistency suggest METABOLIC provides better performance compared to other software and online servers. To highlight the utility and versatility of METABOLIC, we demonstrate its capabilities on diverse metagenomic datasets from the marine subsurface, terrestrial subsurface, meadow soil, deep sea, freshwater lakes, wastewater, and the human gut.

Conclusion

METABOLIC enables the consistent and reproducible study of microbial community ecology and biogeochemistry using a foundation of genome-informed microbial metabolism, and will advance the integration of uncultivated organisms into metabolic and biogeochemical models. METABOLIC is written in Perl and R and is freely available under GPLv3 at https://github.com/AnantharamanLab/METABOLIC.

 
Award ID(s):
2047598
NSF-PAR ID:
10367464
Publisher / Repository:
Springer Science + Business Media
Journal Name:
Microbiome
Volume:
10
Issue:
1
ISSN:
2049-2618
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Microbial communities are essential components of aquatic ecosystems through their contribution to food web dynamics and biogeochemical processes. Aquatic microbial diversity is immense and a general challenge is to understand how metabolism and interactions of single organisms shape microbial community dynamics and ecosystem‐scale biogeochemical transformations. Metagenomic approaches have developed rapidly, and proven to be powerful in linking microbial community dynamics to biogeochemical processes. In this review, we provide an overview of metagenomic approaches, followed by a discussion on some recent insights they have provided, including those in this special issue. These include the discovery of new taxa and metabolisms in aquatic microbiomes, insights into community assembly and functional ecology as well as evolutionary processes shaping microbial genomes and microbiomes, and the influence of human activities on aquatic microbiomes. Given that metagenomics can now be considered a mature technology where data generation and descriptive analyses are relatively routine and informative, we then discuss metagenomic‐enabled research avenues to further link microbial dynamics to biogeochemical processes. These include the integration of metagenomics into well‐designed ecological experiments, the use of metagenomics to inform and validate metabolic and biogeochemical models, and the pressing need for ecologically relevant model organisms and simple microbial systems to better interpret the taxonomic and functional information integrated in metagenomes. These research avenues will contribute to a more mechanistic and predictive understanding of links between microbial dynamics and biogeochemical cycles. Owing to rapid climate change and human impacts on aquatic ecosystems, the urgency of such an understanding has never been greater.

     
  2. Abstract Background

    Microbiomes are now recognized as the main drivers of ecosystem function ranging from the oceans and soils to humans and bioreactors. However, a grand challenge in microbiome science is to characterize and quantify the chemical currencies of organic matter (i.e., metabolites) that microbes respond to and alter. Critical to this has been the development of Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS), which has drastically increased molecular characterization of complex organic matter samples, but challenges users with hundreds of millions of data points where readily available, user-friendly, and customizable software tools are lacking.

    Results

    Here, we build on years of analytical experience with diverse sample types to develop MetaboDirect, an open-source, command-line-based pipeline for the analysis (e.g., chemodiversity analysis, multivariate statistics), visualization (e.g., Van Krevelen diagrams, elemental and molecular class composition plots), and presentation of direct injection high-resolution FT-ICR MS data sets after molecular formula assignment has been performed. When compared to other available FT-ICR MS software, MetaboDirect is superior in that it requires a single line of code to launch a fully automated framework for the generation and visualization of a wide range of plots, with minimal coding experience required. Among the tools evaluated, MetaboDirect is also uniquely able to automatically generate biochemical transformation networks (ab initio) based on mass differences (mass difference network-based approach) that provide an experimental assessment of metabolite connections within a given sample or a complex metabolic system, thereby providing important information about the nature of the samples and the set of microbial reactions or pathways that gave rise to them. Finally, for more experienced users, MetaboDirect allows users to customize plots, outputs, and analyses.

    Conclusion

    Application of MetaboDirect to FT-ICR MS-based metabolomic data sets from a marine phage-bacterial infection experiment and a Sphagnum leachate microbiome incubation experiment showcases the exploration capabilities of the pipeline, which will enable the research community to evaluate and interpret their data in greater depth and in less time. It will further advance our knowledge of how microbial communities influence and are influenced by the chemical makeup of the surrounding system. The source code and user's guide of MetaboDirect are freely available at https://github.com/Coayala/MetaboDirect and https://metabodirect.readthedocs.io/en/latest/, respectively.

     
  3. Gralnick, Jeffrey A. (Ed.)
    ABSTRACT Reconstructing microbial genomes from metagenomic short-read data can be challenging due to the unknown and uneven complexity of microbial communities. This complexity encompasses highly diverse populations, which often include strain variants. Reconstructing high-quality genomes is a crucial part of the metagenomic workflow, as subsequent ecological and metabolic inferences depend on their accuracy, quality, and completeness. In contrast to microbial communities in other ecosystems, there has been no systematic assessment of genome-centric metagenomic workflows for drinking water microbiomes. In this study, we assessed the performance of a combination of assembly and binning strategies for time series drinking water metagenomes that were collected over 6 months. The goal of this study was to identify the combination of assembly and binning approaches that results in high-quality and -quantity metagenome-assembled genomes (MAGs), representing most of the sequenced metagenome. Our findings suggest that the metaSPAdes coassembly strategies had the best performance, as they resulted in larger and less fragmented assemblies, with at least 85% of the sequence data mapping to contigs greater than 1 kbp. Furthermore, a combination of metaSPAdes coassembly strategies and MetaBAT2 produced the highest number of medium-quality MAGs while capturing at least 70% of the metagenomes based on read recruitment. Utilizing different assembly/binning approaches also assists in the reconstruction of unique MAGs from closely related species that would have otherwise collapsed into a single MAG using a single workflow. Overall, our study suggests that leveraging multiple binning approaches with different metaSPAdes coassembly strategies may be required to maximize the recovery of good-quality MAGs.
    IMPORTANCE Drinking water contains phylogenetically diverse groups of bacteria, archaea, and eukarya that affect the esthetic quality of water, water infrastructure, and public health. Taxonomic, metabolic, and ecological inferences of the drinking water microbiome depend on the accuracy, quality, and completeness of genomes that are reconstructed through the application of genome-resolved metagenomics. Using time series metagenomic data, we present reproducible genome-centric metagenomic workflows that result in high-quality and -quantity genomes, which more accurately represent the sequenced drinking water microbiome. These genome-centric metagenomic workflows will allow for improved taxonomic and functional potential analysis that offers enhanced insights into the stability and dynamics of drinking water microbial communities.
  4. Abstract Background

    Stable isotope probing (SIP) approaches are a critical tool in microbiome research to determine associations between species and substrates, as well as the activity of species. The application of these approaches ranges from studying microbial communities important for global biogeochemical cycling to host-microbiota interactions in the intestinal tract. Current SIP approaches, such as DNA-SIP or nanoSIMS, allow analysis of stable isotope incorporation with high coverage of taxa in a community or at the single-cell level, respectively; however, they are limited in terms of sensitivity, resolution, or throughput.

    Results

    Here, we present an ultra-sensitive, high-throughput protein-based stable isotope probing approach (Protein-SIP), which cuts cost for labeled substrates by 50–99% as compared to other SIP and Protein-SIP approaches and thus enables isotope labeling experiments on much larger scales and with higher replication. The approach allows for the determination of isotope incorporation into microbiome members with species level resolution using standard metaproteomics liquid chromatography-tandem mass spectrometry (LC–MS/MS) measurements. At the core of the approach are new algorithms to analyze the data, which have been implemented in an open-source software (https://sourceforge.net/projects/calis-p/). We demonstrate sensitivity, precision and accuracy using bacterial cultures and mock communities with different labeling schemes. Furthermore, we benchmark our approach against two existing Protein-SIP approaches and show that in the low labeling range used our approach is the most sensitive and accurate. Finally, we measure translational activity using ¹⁸O heavy water labeling in a 63-species community derived from human fecal samples grown on media simulating two different diets. Activity could be quantified on average for 27 species per sample, with 9 species showing significantly higher activity on a high protein diet, as compared to a high fiber diet. Surprisingly, among the species with increased activity on high protein were several Bacteroides species known as fiber consumers. Apparently, protein supply is a critical consideration when assessing growth of intestinal microbes on fiber, including fiber-based prebiotics.

    Conclusions

    We demonstrate that our Protein-SIP approach allows for the ultra-sensitive (0.01 to 10% label) detection of stable isotopes of elements found in proteins, using standard metaproteomics data.

     
  5. Abstract Background

    Microbial colonization of subsurface shales following hydraulic fracturing offers the opportunity to study coupled biotic and abiotic factors that impact microbial persistence in engineered deep subsurface ecosystems. Shale formations underlie much of the continental USA and display geographically distinct gradients in temperature and salinity. Complementing studies performed in eastern USA shales that contain brine-like fluids, here we coupled metagenomic and metabolomic approaches to develop the first genome-level insights into ecosystem colonization and microbial community interactions in a lower-salinity, but high-temperature western USA shale formation.

    Results

    We collected materials used during the hydraulic fracturing process (i.e., chemicals, drill muds) paired with temporal sampling of water produced from three different hydraulically fractured wells in the STACK (Sooner Trend Anadarko Basin, Canadian and Kingfisher) shale play in OK, USA. Relative to other shale formations, our metagenomic and metabolomic analyses revealed an expanded taxonomic and metabolic diversity of microorganisms that colonize and persist in fractured shales. Importantly, temporal sampling across all three hydraulic fracturing wells traced the degradation of complex polymers from the hydraulic fracturing process to the production and consumption of organic acids that support sulfate- and thiosulfate-reducing bacteria. Furthermore, we identified 5587 viral genomes and linked many of these to the dominant, colonizing microorganisms, demonstrating the key role that viral predation plays in community dynamics within this closed, engineered system. Lastly, top-side audit sampling of different source materials enabled genome-resolved source tracking, revealing the likely sources of many key colonizing and persisting taxa in these ecosystems.

    Conclusions

    These findings highlight the importance of resource utilization and resistance to viral predation as key traits that enable specific microbial taxa to persist across fractured shale ecosystems. We also demonstrate the importance of materials used in the hydraulic fracturing process as both a source of persisting shale microorganisms and organic substrates that likely aid in sustaining the microbial community. Moreover, we showed that different physicochemical conditions (i.e., salinity, temperature) can influence the composition and functional potential of persisting microbial communities in shale ecosystems. Together, these results expand our knowledge of microbial life in deep subsurface shales and have important ramifications for management and treatment of microbial biomass in hydraulically fractured wells.