

Title: MetaPop: a pipeline for macro- and microdiversity analyses and visualization of microbial and viral metagenome-derived populations
Abstract

Background: Microbes and their viruses are hidden engines driving Earth’s ecosystems from the oceans and soils to humans and bioreactors. Though gene marker approaches can now be complemented by genome-resolved studies of inter-population (macrodiversity) and intra-population (microdiversity) variation, analytical tools to do so remain scattered or under-developed.

Results: Here, we introduce MetaPop, an open-source bioinformatic pipeline that provides a single interface to analyze and visualize microbial and viral community metagenomes at both the macro- and microdiversity levels. Macrodiversity estimates include population abundances and α- and β-diversity. Microdiversity calculations include identification of single nucleotide polymorphisms (SNPs), novel codon-constrained linkage of SNPs, nucleotide diversity (π and θ), and selective pressures (pN/pS and Tajima’s D) within populations, and fixation indices (F_ST) between populations. MetaPop will also identify genes with distinct codon usage. Following rigorous validation, we applied MetaPop to the gut viromes of autistic children who underwent fecal microbiota transfers (FMT) and their neurotypical peers. The macrodiversity results confirmed our prior findings for viral populations (microbial shotgun metagenomes were not available) that diversity did not significantly differ between autistic and neurotypical children. However, by also quantifying microdiversity, MetaPop revealed lower average viral nucleotide diversity (π) in autistic children. The percentage of genomes detected under positive selection was also lower among autistic children, suggesting that higher viral π in neurotypical children may be beneficial because it allows populations to better “bet hedge” in changing environments. Further, comparisons of microdiversity pre- and post-FMT in autistic children revealed that the FMT delivery method (oral versus rectal) may influence viral activity and engraftment of microdiverse viral populations, with children who received their FMT rectally having higher microdiversity post-FMT. Overall, these results show that analyses at the macro level alone can miss important biological differences.

Conclusions: These findings suggest that standardized population and genetic variation analyses will be invaluable for maximizing biological inference, and MetaPop provides a convenient tool package to explore the dual impact of macro- and microdiversity across microbial communities.
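To make the nucleotide diversity (π) statistic in the abstract concrete, the sketch below computes a per-site and a genome-averaged π from read base counts. It uses the standard unbiased pairwise-difference estimator and is not taken from MetaPop's code. The function names, the dictionary-based pileup layout, and the choice to average over the full genome length are all assumptions made for illustration.

```python
def site_pi(base_counts):
    """Per-site nucleotide diversity from base counts at one position.

    base_counts maps a base (e.g. "A") to the number of reads supporting it.
    Returns the probability that two reads drawn without replacement differ.
    """
    n = sum(base_counts.values())
    if n < 2:
        # Fewer than two reads: no pairwise comparison is possible.
        return 0.0
    same_pairs = sum(c * (c - 1) for c in base_counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


def genome_pi(pileup, genome_length):
    """Average per-site pi over a genome (hypothetical input layout).

    pileup maps a position to its base counts; positions missing from the
    pileup are assumed to be invariant and contribute zero.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(site_pi(counts) for counts in pileup.values()) / genome_length


# Toy example: one polymorphic site in a 10 bp genome.
example = {3: {"A": 5, "G": 5}, 7: {"C": 10}}
print(genome_pi(example, genome_length=10))
```

A population with more polymorphic sites, or with more evenly balanced alleles at those sites, yields a higher value, which is the sense in which the abstract compares π between groups of children.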
Award ID(s):
1759831 1759874
NSF-PAR ID:
10354254
Journal Name:
Microbiome
Volume:
10
Issue:
1
ISSN:
2049-2618
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Coexisting microbial cells of the same species often exhibit genetic variation that can affect phenotypes ranging from nutrient preference to pathogenicity. Here we present inStrain, a program that uses metagenomic paired reads to profile intra-population genetic diversity (microdiversity) across whole genomes and compares microbial populations in a microdiversity-aware manner, greatly increasing the accuracy of genomic comparisons when benchmarked against existing methods. We use inStrain to profile >1,000 fecal metagenomes from newborn premature infants and find that siblings share significantly more strains than unrelated infants, although identical twins share no more strains than fraternal siblings. Infants born by cesarean section harbor Klebsiella with significantly higher nucleotide diversity than infants delivered vaginally, potentially reflecting acquisition from hospital rather than maternal microbiomes. Genomic loci that show diversity in individual infants include variants found between other infants, possibly reflecting inoculation from diverse hospital-associated sources. inStrain can be applied to any metagenomic dataset for microdiversity analysis and rigorous strain comparison.
  2. Abstract

    What a strain is and how many strains make up a natural bacterial population remain elusive concepts despite their apparent importance for assessing the role of intra-population diversity in disease emergence or response to environmental perturbations. To advance these concepts, we sequenced 138 randomly selected Salinibacter ruber isolates from two solar salterns and assessed these genomes against companion short-read metagenomes from the same samples. The distribution of genome-aggregate average nucleotide identity (ANI) values among these isolates was bimodal, with four-fold lower occurrence of values between 99.2% and 99.8% relative to ANI >99.8% or <99.2%, revealing a natural “gap” in the sequence space within species. Accordingly, we used this ANI gap to define genomovars and a higher ANI value of >99.99% and shared gene-content >99.0% to define strains. Using these thresholds and extrapolating from how many metagenomic reads each genomovar uniquely recruited, we estimated that, although our 138 isolates represented about 80% of the Sal. ruber population, the total population in one saltern pond is composed of 5,500 to 11,000 genomovars, the great majority of which appear to be rare in situ. These data also revealed that the most frequently recovered isolate in lab media was often not the most abundant genomovar in situ, suggesting that cultivation biases are significant, even in cases where cultivation procedures are thought to be robust. The methodology and ANI thresholds outlined here should represent a useful guide for future microdiversity surveys of additional microbial species.

     
  3. Summary

    Oxygen minimum zones (OMZs) are critical to marine nitrogen cycling and global climate change. While OMZ microbial communities are relatively well‐studied, little is known about their viruses. Here, we assess the viral community ecology of 22 deeply sequenced viral metagenomes along a gradient of oxygenated to anoxic waters (<0.02 μmol/l O2) in the Eastern Tropical South Pacific (ETSP) OMZ. We identified 46 127 viral populations (≥5 kb), which augments the known viruses from ETSP by 10‐fold. Viral communities clustered into six groups that correspond to oceanographic features. Oxygen concentration was the predominant environmental feature driving viral community structure. Alpha and beta diversity of viral communities in the anoxic zone were lower than in surface waters, which parallels the low microbial diversity seen in other studies. ETSP viruses were largely endemic, with the majority of shared viruses (87%) also present in other OMZ samples. We detected 543 putative viral‐encoded auxiliary metabolic genes (AMGs), of which some have a distribution that reflects physico‐chemical characteristics across depth. Together these findings provide an ecological baseline for viral community structure, drivers and population variability in OMZs that will help future studies assess the role of viruses in these climate‐critical environments.

     
  4.
    The extent and ecological significance of intraspecific diversity within marine microbial populations is still poorly understood, and it remains unclear if such strain-level microdiversity will affect fitness and persistence in a rapidly changing ocean environment. In this study, we cultured 11 sympatric strains of the ubiquitous marine picocyanobacterium Synechococcus isolated from a Narragansett Bay (Rhode Island, USA) phytoplankton community thermal selection experiment. Despite all 11 isolates being highly similar (with average nucleotide identities of >99.9%, with 98.6-100% of the genome aligning), thermal performance curves revealed selection at warm and cool temperatures had subdivided the initial population into thermotypes with pronounced differences in maximum growth temperatures. Within the fine-scale genetic diversity that did exist within this population, the two divergent thermal ecotypes differed at a locus containing genes for the phycobilisome antenna complex. Our study demonstrates that present-day marine microbial populations can contain microdiversity in the form of cryptic but environmentally-relevant thermotypes that may increase their resilience to future rising temperatures. 
  5. The extent and ecological significance of intraspecific functional diversity within marine microbial populations is still poorly understood, and it remains unclear if such strain-level microdiversity will affect fitness and persistence in a rapidly changing ocean environment. In this study, we cultured 11 sympatric strains of the ubiquitous marine picocyanobacterium Synechococcus isolated from a Narragansett Bay (RI) phytoplankton community thermal selection experiment. Thermal performance curves revealed selection at cool and warm temperatures had subdivided the initial population into thermotypes with pronounced differences in maximum growth temperatures. Curiously, the genomes of all 11 isolates were almost identical (average nucleotide identities of >99.99%, with >99% of the genome aligning) and no differences in gene content or single nucleotide variants were associated with either cool or warm temperature phenotypes. Despite a very high level of genomic similarity, sequenced epigenomes for two strains showed differences in methylation on genes associated with photosynthesis. These corresponded to measured differences in photophysiology, suggesting a potential pathway for future mechanistic research into thermal microdiversity. Our study demonstrates that present-day marine microbial populations can harbor cryptic but environmentally relevant thermotypes which may increase their resilience to future rising temperatures.

     