Search for: All records

Award ID contains: 1759831


  1. Jouline, Igor B (Ed.)
    ABSTRACT

    Large-scale surveys of prokaryotic communities (metagenomes), as well as isolate genomes, have revealed that their diversity is predominantly organized in sequence-discrete units that may be equated to species. Specifically, genomes of the same species commonly show genome-aggregate average nucleotide identity (ANI) >95% among themselves and ANI <90% to members of other species, while genomes showing ANI 90%–95% are comparatively rare. However, it remains unclear if such “discontinuities” or gaps in ANI values can be observed within species and thus used to advance and standardize intra-species units. By analyzing 18,123 complete isolate genomes from 330 bacterial species with at least 10 genome representatives each and available long-read metagenomes, we show that another discontinuity exists between 99.2% and 99.8% (midpoint 99.5%) ANI in most of these species. The 99.5% ANI threshold is largely consistent with how sequence types have been defined in previous epidemiological studies but provides clusters with ~20% higher accuracy in terms of evolutionary and gene-content relatedness of the grouped genomes, while strains should be consequently defined at higher ANI values (>99.99% proposed). Collectively, our results should facilitate future micro-diversity studies across clinical or environmental settings because they provide a more natural definition of intra-species units of diversity.

    IMPORTANCE

    Bacterial strains and clonal complexes are two cornerstone concepts for microbiology that remain loosely defined, which confuses communication and research. Here we identify a natural gap in genome sequence comparisons among isolate genomes of all well-sequenced species that has gone unnoticed so far and could be used to more accurately and precisely define these and related concepts compared to current methods. These findings advance the molecular toolbox for accurately delineating and following the important units of diversity within prokaryotic species and thus should greatly facilitate future epidemiological and micro-diversity studies across clinical and environmental settings.

     
    Free, publicly-accessible full text available January 16, 2025
  2. Abstract

    Whether prokaryotes, and other microorganisms, form distinct clusters that can be recognized as species remains an issue of paramount theoretical as well as practical consequence in identifying, regulating, and communicating about these organisms. In the past decade, comparisons of thousands of genomes of isolates and hundreds of metagenomes have shown that prokaryotic diversity may be predominantly organized in such sequence‐discrete clusters, albeit organisms of intermediate relatedness between the identified clusters are also frequently found. Accumulating evidence suggests, however, that the latter “intermediate” organisms show enough ecological and/or functional distinctiveness to be considered different species. Notably, the area of discontinuity between clusters often—but not always—appears to be around 85%–95% genome‐average nucleotide identity, consistently among different taxa. More recent studies have revealed remarkably similar diversity patterns for viruses and microbial eukaryotes as well. This high consistency across taxa implies a specific mechanistic process that underlies the maintenance of the clusters. The underlying mechanism may be a substantial reduction in the efficiency of homologous recombination, which mediates (successful) horizontal gene transfer, around 95% nucleotide identity. Deviations from the 95% threshold (e.g., species showing lower intraspecies diversity) may be caused by ecological differentiation that imposes barriers to otherwise frequent gene transfer. While this hypothesis that clusters are driven by ecological differentiation coupled to recombination frequency (i.e., higher recombination within vs. between groups) is appealing, the supporting evidence remains anecdotal. The data needed to rigorously test the hypothesis toward advancing the species concept are also outlined.

     
    Free, publicly-accessible full text available December 1, 2024
  3. Abstract

    Metagenomic surveys have revealed that natural microbial communities are predominantly composed of sequence-discrete, species-like populations but the genetic and/or ecological processes that maintain such populations remain speculative, limiting our understanding of population speciation and adaptation to perturbations. To address this knowledge gap, we sequenced 112 Salinibacter ruber isolates and 12 companion metagenomes from four adjacent saltern ponds in Mallorca, Spain that were experimentally manipulated to dramatically alter salinity and light intensity, the two major drivers of this ecosystem. Our analyses showed that the pangenome of the local Sal. ruber population is open and similar in size (~15,000 genes) to that of randomly sampled Escherichia coli genomes. While most of the accessory (noncore) genes were isolate-specific and showed low in situ abundances based on the metagenomes compared to the core genes, indicating that they were functionally unimportant and/or transient, 3.5% of them became abundant when salinity (but not light) conditions changed and encoded for functions related to osmoregulation. Nonetheless, the ecological advantage of these genes, while significant, was apparently not strong enough to purge diversity within the population. Collectively, our results provide an explanation for how this immense intrapopulation gene diversity is maintained, which has implications for the prokaryotic species concept.

     
  4. Summary

    Recent advances in sequencing technology and bioinformatic pipelines have allowed unprecedented access to the genomes of yet‐uncultivated microorganisms from diverse environments. However, the catalogue of freshwater genomes remains limited, and most genome recovery attempts in freshwater ecosystems have only targeted specific taxa. Here, we present a genome recovery pipeline incorporating iterative subtractive binning, and apply it to a time series of 100 metagenomic datasets from seven connected lakes and estuaries along the Chattahoochee River (Southeastern USA). Our set of metagenome‐assembled genomes (MAGs) represents >400 yet‐unnamed genomospecies, substantially increasing the number of high‐quality MAGs from freshwater lakes. We propose names for two novel species: ‘Candidatus Elulimicrobium humile’ (‘Ca. Elulimicrobiota’, ‘Patescibacteria’) and ‘Candidatus Aquidulcis frankliniae’ (‘Chloroflexi’). Collectively, our MAGs represented about half of the total microbial community at any sampling point. To evaluate the prevalence of these genomospecies in the chronoseries, we introduce methodologies to estimate relative abundance and habitat preference that control for uneven genome quality and sample representation. We demonstrate high degrees of habitat‐specialization and endemicity for most genomospecies in the Chattahoochee lakes. Wider ecological ranges characterized smaller genomes with higher coding densities, indicating an overall advantage of smaller, more compact genomes for cosmopolitan distributions.

     
  5. Summary

    Bacteriophages encode host‐acquired functional genes known as auxiliary metabolic genes (AMGs). Photosynthesis AMGs are commonly found in marine cyanobacteria‐infecting Myoviridae and Podoviridae cyanophages, but their ecology remains understudied in freshwater environments. To advance knowledge of this issue, we analysed viral metagenomes collected in the summertime for four years from five lakes and two estuarine locations interconnected by the Chattahoochee River, Southeast USA. Sequences representing ten different AMGs were recovered and found to be prevalent in all sites. Most freshwater AMGs were 10‐fold less abundant than estuarine and marine AMGs and were encoded by novel Myoviridae and Podoviridae cyanophage genera. Notably, several of the corresponding viral genomes showed endemism to a specific province along the river. This translated into psbA gene phylogenetic clustering patterns that matched a marine vs. freshwater origin, indicating that psbA may serve as a robust classification and source‐tracking biomarker. Genomes classified in a novel viral lineage represented by isolate S‐EIV1 contained psbA, which is unprecedented for this lineage. Collectively, our findings indicated that the acquisition of photosynthesis AMGs is a widespread strategy used by cyanophages in aquatic ecosystems, and further indicated the existence of viral provinces in which certain viral species and/or genotypes are locally abundant.

     
  6. Free, publicly-accessible full text available January 1, 2025
  7. Free, publicly-accessible full text available November 11, 2024
  8. Abstract

    Background

    Microbes and their viruses are hidden engines driving Earth’s ecosystems from the oceans and soils to humans and bioreactors. Though gene marker approaches can now be complemented by genome-resolved studies of inter- (macrodiversity) and intra- (microdiversity) population variation, analytical tools to do so remain scattered or under-developed.

    Results

    Here, we introduce MetaPop, an open-source bioinformatic pipeline that provides a single interface to analyze and visualize microbial and viral community metagenomes at both the macro- and microdiversity levels. Macrodiversity estimates include population abundances and α- and β-diversity. Microdiversity calculations include identification of single nucleotide polymorphisms, novel codon-constrained linkage of SNPs, nucleotide diversity (π and θ), and selective pressures (pN/pS and Tajima’s D) within and fixation indices (F_ST) between populations. MetaPop will also identify genes with distinct codon usage. Following rigorous validation, we applied MetaPop to the gut viromes of autistic children that underwent fecal microbiota transfers and their neurotypical peers. The macrodiversity results confirmed our prior findings for viral populations (microbial shotgun metagenomes were not available) that diversity did not significantly differ between autistic and neurotypical children. However, by also quantifying microdiversity, MetaPop revealed lower average viral nucleotide diversity (π) in autistic children. The percentage of genomes detected under positive selection was also lower among autistic children, suggesting that higher viral π in neurotypical children may be beneficial because it allows populations to better “bet hedge” in changing environments. Further, comparisons of microdiversity pre- and post-FMT in autistic children revealed that the FMT delivery method (oral versus rectal) may influence viral activity and engraftment of microdiverse viral populations, with children who received their FMT rectally having higher microdiversity post-FMT. Overall, these results show that analyses at the macro level alone can miss important biological differences.

    Conclusions

    These findings suggest that standardized population and genetic variation analyses will be invaluable for maximizing biological inference, and MetaPop provides a convenient tool package to explore the dual impact of macro- and microdiversity across microbial communities.
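The nucleotide diversity (π) reported in the MetaPop abstract can be illustrated with a small sketch. The abstract does not describe how MetaPop computes π, so the code below is not its implementation. It uses the textbook unbiased per-site estimator, π = n/(n−1) · (1 − Σ pᵢ²), which equals the fraction of read pairs that differ at a position. The function names `site_pi` and `mean_pi` and the input layout (one string of observed bases per position) are assumptions made for illustration only.

```python
from collections import Counter
from typing import Iterable, Sequence


def site_pi(bases: Iterable[str]) -> float:
    """Unbiased nucleotide diversity at one position.

    `bases` holds the base observed in each read covering the position.
    Hypothetical helper; not taken from MetaPop.
    """
    counts = Counter(bases)
    n = sum(counts.values())
    if n < 2:
        # Diversity is undefined with fewer than two observations; report 0.
        return 0.0
    # Sum of squared allele frequencies (probability two draws match,
    # sampling with replacement).
    homozygosity = sum(c * c for c in counts.values()) / (n * n)
    # The n / (n - 1) factor corrects for sampling without replacement.
    return (n / (n - 1)) * (1.0 - homozygosity)


def mean_pi(columns: Sequence[Iterable[str]]) -> float:
    """Average per-site diversity over all positions of a genome or gene."""
    if not columns:
        return 0.0
    return sum(site_pi(col) for col in columns) / len(columns)


if __name__ == "__main__":
    # Two positions: one polymorphic (3 A, 1 T), one invariant.
    print(mean_pi(["AAAT", "AAAA"]))
```

In this toy input, three of the six possible read pairs differ at the first position, so its π is 0.5, and the invariant second position contributes 0. A lower average π, as reported for the viral populations of autistic children, means that reads covering the same position disagree less often.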