This content will become publicly available on July 17, 2026

Title: A periodic table of bacteria?: Mapping bacterial diversity in trait space
Abstract Bacterial diversity can be overwhelming. There is an ever-expanding number of bacterial taxa being discovered, but many of these taxa remain uncharacterized with unknown traits and environmental preferences. This diversity makes it challenging to interpret ecological patterns in microbiomes and understand why individual taxa, or assemblages, may vary across space and time. While we can use information from the rapidly growing databases of bacterial genomes to infer traits, we still need an approach to organize what we know, or think we know, about bacterial taxa to match taxonomic and phylogenetic information to trait inferences. Inspired by the periodic table of the elements, we have constructed a ‘periodic table’ of bacterial taxa to organize and visualize monophyletic groups of bacteria based on the distributions of key traits predicted from genomic data. By analyzing 50,745 genomes across 31 bacterial phyla, we used the Haar-like wavelet transformation, a model-free transformation of trait data, to identify clades of bacteria which are nearly uniform with respect to six selected traits: oxygen tolerance, autotrophy, chlorophototrophy, maximum potential growth rate, GC content and genome size. The identified functionally uniform clades of bacteria are presented in a concise ‘periodic table’-like format to facilitate identification and exploration of bacterial lineages in trait space. While our approach could be improved and expanded in the future, we demonstrate its utility for integrating phylogenetic information with genome-derived trait values to improve our understanding of the bacterial diversity found in environmental and host-associated microbiomes.
Award ID(s):
2126106
PAR ID:
10650420
Publisher / Repository:
bioRxiv
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Although microorganisms are known to dominate Earth’s biospheres and drive biogeochemical cycling, little is known about the geographic distributions of microbial populations or the environmental factors that pattern those distributions. We used a global-level hierarchical sampling scheme to comprehensively characterize the evolutionary relationships and distributional limitations of the nitrogen-fixing bacterial symbionts of the crop chickpea, generating 1,027 draft whole-genome sequences at the level of bacterial populations, including 14 high-quality PacBio genomes from a phylogenetically representative subset. We find that diverse Mesorhizobium taxa perform symbiosis with chickpea and have largely overlapping global distributions. However, sampled locations cluster based on the phylogenetic diversity of Mesorhizobium populations, and diversity clusters correspond to edaphic and environmental factors, primarily soil type and latitude. Despite long-standing evolutionary divergence and geographic isolation, the diverse taxa observed to nodulate chickpea share a set of integrative conjugative elements (ICEs) that encode the major functions of the symbiosis. This symbiosis ICE takes 2 forms in the bacterial chromosome (tripartite and monopartite), with tripartite ICEs confined to a broadly distributed superspecies clade. The pairwise evolutionary relatedness of these elements is controlled as much by geographic distance as by the evolutionary relatedness of the background genome. In contrast, diversity in the broader gene content of Mesorhizobium genomes follows a tight linear relationship with core genome phylogenetic distance, with little detectable effect of geography. These results illustrate how geography and demography can operate differentially on the evolution of bacterial genomes and offer useful insights for the development of improved technologies for sustainable agriculture.
  2. Abstract The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth’s continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.
  3. Abstract Genomic information is now available for a broad diversity of bacteria, including uncultivated taxa. However, we have corresponding knowledge on environmental preferences (i.e. bacterial growth responses across gradients in oxygen, pH, temperature, salinity, and other environmental conditions) for a relatively narrow swath of bacterial diversity. These limits to our understanding of bacterial ecologies constrain our ability to predict how assemblages will shift in response to global change factors, design effective probiotics, or guide cultivation efforts. We need innovative approaches that take advantage of expanding genome databases to accurately infer the environmental preferences of bacteria and validate the accuracy of these inferences. By doing so, we can broaden our quantitative understanding of the environmental preferences of the majority of bacterial taxa that remain uncharacterized. With this perspective, we highlight why it is important to infer environmental preferences from genomic information and discuss the range of potential strategies for doing so. In particular, we highlight concrete examples of how both cultivation-independent and cultivation-dependent approaches can be integrated with genomic data to develop predictive models. We also emphasize the limitations and pitfalls of these approaches and the specific knowledge gaps that need to be addressed to successfully expand our understanding of the environmental preferences of bacteria. 
  4. Abstract Background Tropical members of the sponge genus Ircinia possess highly complex microbiomes that perform a broad spectrum of chemical processes that influence host fitness. Despite the pervasive role of microbiomes in Ircinia biology, it is still unknown how they remain in stable association across tropical species. To address this question, we performed a comparative analysis of the microbiomes of 11 Ircinia species using whole-metagenomic shotgun sequencing data to investigate three aspects of bacterial symbiont genomes: the redundancy in metabolic pathways across taxa, the evolution of genes involved in pathogenesis, and the nature of selection acting on genes relevant to secondary metabolism. Results A total of 424 new, high-quality bacterial metagenome-assembled genomes (MAGs) were produced for 10 Caribbean Ircinia species, which were evaluated alongside 113 publicly available MAGs sourced from the Pacific species Ircinia ramosa. Evidence of redundancy was discovered in that the core genes of several primary metabolic pathways could be found in the genomes of multiple bacterial taxa. Across hosts, the metagenomes were depleted in genes relevant to pathogenicity and enriched in eukaryotic-like proteins (ELPs) that likely mimic the hosts’ molecular patterning. Finally, clusters of steroid biosynthesis genes (CSGs), which appear to be under purifying selection and undergo horizontal gene transfer, were found to be a defining feature of Ircinia metagenomes. Conclusions These results illustrate patterns of genome evolution within highly complex microbiomes that illuminate how associations with hosts are maintained. The metabolic redundancy within the microbiomes could help buffer the hosts from changes in the ambient chemical and physical regimes and from fluctuations in the population sizes of the individual microbial strains that make up the microbiome. Additionally, the enrichment of ELPs and depletion of LPS and cellular motility genes provide a model for how alternative strategies to virulence can evolve in microbiomes undergoing mixed-mode transmission that do not ultimately result in higher levels of damage (i.e., pathogenicity) to the host. Our last set of results provides evidence that sterol biosynthesis in Ircinia-associated bacteria is widespread and that these molecules are important for the survival of bacteria in highly complex Ircinia microbiomes.
  5. Abstract Flagellar motility is a key bacterial trait as it allows bacteria to navigate their immediate surroundings. Not all bacteria are capable of flagellar motility, and the distribution of this trait, its ecological associations, and the life history strategies of flagellated taxa remain poorly characterized. We developed and validated a genome-based approach to infer the potential for flagellar motility across 12 bacterial phyla (26,192 unique genomes). The capacity for flagellar motility was associated with a higher prevalence of genes for carbohydrate metabolism and higher maximum potential growth rates, suggesting that flagellar motility is more prevalent in environments with higher carbon availability. To test this hypothesis, we applied a method to infer the prevalence of flagellar motility in whole bacterial communities from metagenomic data and quantified the prevalence of flagellar motility across four independent field studies that each captured putative gradients in soil carbon availability (148 metagenomes). We observed a positive relationship between the prevalence of bacterial flagellar motility and soil carbon availability in all datasets. Since soil carbon availability is often correlated with other factors that could influence the prevalence of flagellar motility, we validated these observations using metagenomic data from a soil incubation experiment where carbon availability was directly manipulated with glucose amendments. This confirmed that the prevalence of bacterial flagellar motility is consistently associated with soil carbon availability over other potential confounding factors. This work highlights the value of combining predictive genomic and metagenomic approaches to expand our understanding of microbial phenotypic traits and reveal their general environmental associations.