

Search for: All records

Award ID contains: 1924492


  1. Abstract Surveys of microbial communities (metagenomics) or isolate genomes have revealed sequence-discrete species. That is, members of the same species show >95% average nucleotide identity (ANI) of shared genes among themselves vs. <83% ANI to members of other species, while genome pairs showing between 83% and 95% ANI are comparatively rare. In these surveys, aquatic bacteria of the ubiquitous SAR11 clade (Class Alphaproteobacteria) are an outlier and often do not exhibit discrete species boundaries, suggesting the potential for alternate modes of genetic differentiation. To explore evolution in SAR11, we analyzed high-quality, single-cell amplified genomes and companion metagenomes from an oxygen minimum zone in the Eastern Tropical Pacific Ocean, where SAR11 make up ~20% of the total microbial community. Our results show that SAR11 do form several sequence-discrete species, but their ANI range of discreteness is shifted to lower identities between 86% and 91%, with intra-species ANI ranging between 91% and 100%. Measuring recent gene exchange among these genomes with a recently developed methodology revealed a higher frequency of homologous recombination within than between species, which affects sequence evolution at least twice as much as diversifying point mutation across the genome. Recombination in SAR11 appears to be more promiscuous than in other prokaryotic species, likely due to the deletion of universal genes involved in mismatch repair, and has facilitated the spread of adaptive mutations within the species (gene sweeps), further promoting the high intraspecies diversity observed. Collectively, these results implicate rampant, genome-wide homologous recombination as the mechanism of cohesion for distinct SAR11 species.
  2. Abstract Secondary metabolites play essential roles in ecological interactions and nutrient acquisition, and are of interest for their potential uses in medicine and biotechnology. Genome mining for biosynthetic gene clusters (BGCs) can be used for the discovery of new compounds. Here, we use metagenomics and metatranscriptomics to analyze BGCs in free-living and particle-associated microbial communities through the stratified water column of the Cariaco Basin, Venezuela. We recovered 565 bacterial and archaeal metagenome-assembled genomes (MAGs) and identified 1154 diverse BGCs. We show that differences in water redox potential and microbial lifestyle (particle-associated vs. free-living) are associated with variations in the predicted composition and production of secondary metabolites. Our results indicate that microbes, including understudied clades such as Planctomycetota, potentially produce a wide range of secondary metabolites in these anoxic/euxinic waters. 
  3. Fraser, Claire M. (Ed.)
    ABSTRACT Metagenomics is a powerful method for interpreting the ecological roles and physiological capabilities of mixed microbial communities. Yet, many tools for processing metagenomic data are neither designed to consider eukaryotes nor built for an increasing amount of sequence data. EukHeist is an automated pipeline to retrieve eukaryotic and prokaryotic metagenome-assembled genomes (MAGs) from large-scale metagenomic sequence data sets. We developed the EukHeist workflow to specifically process large amounts of metagenomic and/or metatranscriptomic sequence data in an automated and reproducible fashion. Here, we applied EukHeist to the large-size fraction data (0.8–2,000 µm) from Tara Oceans to recover both eukaryotic and prokaryotic MAGs, which we refer to as TOPAZ (Tara Oceans Particle-Associated MAGs). The TOPAZ MAGs consisted of >900 environmentally relevant eukaryotic MAGs and >4,000 bacterial and archaeal MAGs. The bacterial and archaeal TOPAZ MAGs expand upon the phylogenetic diversity of likely particle- and host-associated taxa. We use these MAGs to demonstrate an approach to infer the putative trophic mode of the recovered eukaryotic MAGs. We also identify ecological cohorts of co-occurring MAGs, which are driven by specific environmental factors and putative host-microbe associations. These data together add to a number of growing resources of environmentally relevant eukaryotic genomic information. Complementary and expanded databases of MAGs, such as those provided through scalable pipelines like EukHeist, stand to advance our understanding of eukaryotic diversity through increased coverage of genomic representatives across the tree of life.
    IMPORTANCE Single-celled eukaryotes play ecologically significant roles in the marine environment, yet fundamental questions about their biodiversity, ecological function, and interactions remain. Environmental sequencing enables researchers to document naturally occurring protistan communities, without culturing bias, yet metagenomic and metatranscriptomic sequencing approaches cannot separate individual species from communities. To more completely capture the genomic content of mixed protistan populations, we can create bins of sequences that represent the same organism (metagenome-assembled genomes [MAGs]). We developed the EukHeist pipeline, which automates the binning of population-level eukaryotic and prokaryotic genomes from metagenomic reads. We show exciting insight into what protistan communities are present and their trophic roles in the ocean. Scalable computational tools, like EukHeist, may accelerate the identification of meaningful genetic signatures from large data sets and complement researchers’ efforts to leverage MAG databases for addressing ecological questions, resolving evolutionary relationships, and discovering potentially novel biodiversity.
  4. The reconstruction of complete microbial metabolic pathways using ‘omics data from environmental samples remains challenging. Computational pipelines for pathway reconstruction that utilize machine learning methods to predict the presence or absence of KEGG modules in incomplete genomes are lacking. Here, we present MetaPathPredict, a software tool that incorporates machine learning models to predict the presence of complete KEGG modules within bacterial genomic datasets. Using gene annotation data and information from the KEGG module database, MetaPathPredict employs deep learning models to predict the presence of KEGG modules in a genome. MetaPathPredict can be used as a command line tool or as a Python module, and both options are designed to be run locally or on a compute cluster. Benchmarks show that MetaPathPredict makes robust predictions of KEGG module presence within highly incomplete genomes. 
  5. Plasmids are mobile genetic elements known to carry secondary metabolic genes that affect the fitness and survival of microbes in the environment. Well-studied cases of plasmid-encoded secondary metabolic genes in marine habitats include toxin/antitoxin and antibiotic biosynthesis/resistance genes. Here, we examine metagenome-assembled genomes (MAGs) from the permanently stratified water column of the Cariaco Basin for integrated plasmids that encode biosynthetic gene clusters of secondary metabolites (smBGCs). We identify 16 plasmid-borne smBGCs in MAGs associated primarily with Planctomycetota and Pseudomonadota that encode terpene-synthesizing genes, and genes for production of ribosomal and non-ribosomal peptides. These identified genes encode secondary metabolites that are mainly antimicrobial agents, and hence, their uptake via plasmids may increase the competitive advantage of those host taxa that acquire them. The ecological and evolutionary significance of smBGCs carried by prokaryotes in oxygen-depleted water columns is yet to be fully elucidated.