
Search for: All records

Creators/Authors contains: "Fuhrman, Jed A."


  1. Abstract

    The introduction of high-throughput chromosome conformation capture (Hi-C) into metagenomics enables the reconstruction of high-quality metagenome-assembled genomes (MAGs) from microbial communities. Despite recent advances in recovering eukaryotic, bacterial, and archaeal genomes using Hi-C contact maps, few Hi-C-based methods are designed to retrieve viral genomes. Here we introduce ViralCC, a publicly available tool to recover complete viral genomes and detect virus-host pairs using Hi-C data. Compared to other Hi-C-based methods, ViralCC leverages the virus-host proximity structure as a complementary information source for the Hi-C interactions. Using mock and real metagenomic Hi-C datasets from several different microbial ecosystems, including the human gut, cow feces, and wastewater, we demonstrate that ViralCC outperforms existing Hi-C-based binning methods as well as state-of-the-art tools specifically dedicated to metagenomic viral binning. ViralCC can also reveal the taxonomic structure of viruses and virus-host pairs in microbial communities. When applied to a real wastewater metagenomic Hi-C dataset, ViralCC constructs a phage-host network, which is further validated using CRISPR spacer analyses. ViralCC is an open-source pipeline available at

  2. Abstract

    Free-living and particle-associated marine prokaryotes have physiological, genomic, and phylogenetic differences, yet factors influencing their temporal dynamics remain poorly constrained. In this study, we quantify the entire microbial community composition monthly over several years, including viruses, prokaryotes, phytoplankton, and total protists, from the San Pedro Ocean Time-series using ribosomal RNA sequencing and viral metagenomics. Canonical analyses show that in addition to physicochemical factors, the double-stranded DNA viral community is the strongest factor predicting free-living prokaryotes, explaining 28% of variability, whereas the phytoplankton community (via chloroplast 16S rRNA) is the strongest factor predicting particle-associated prokaryotes, explaining 31% of variability. Unexpectedly, the protist community explains little variability. Our findings suggest that biotic interactions are significant determinants of the temporal dynamics of prokaryotes, and that the relative importance of specific interactions varies with lifestyle. Also, warming influenced the prokaryotic community, which largely remained oligotrophic and summer-like throughout 2014–15, with cyanobacterial populations shifting from cold-water ecotypes to warm-water ecotypes.

  3. Bacteria are single-celled organisms that live out their lives at a microscopic scale. We can find bacteria everywhere we look for them, including inside of our own bodies. Bacteria are incredibly diverse and come in many shapes and sizes. They also vary widely in how they live and grow. Some bacteria grow very quickly and others grow slowly. We wanted to measure the growth of many different types of bacteria in the environment. Unfortunately, some species of bacteria are very difficult to grow in the laboratory. To get around this, we designed a method to predict how fast a type of bacteria can grow, just from its DNA. This way, if we have the DNA of a bacterial species, we can measure its growth even if we cannot get it to grow in our laboratory.
    Free, publicly-accessible full text available June 24, 2023
  4. Abstract

    Motivation

    Phage–host associations play important roles in microbial communities. But in natural communities, where phages are discovered and characterized metagenomically (as opposed to culture-based lab studies), their hosts are generally not known. Several programs have been developed for predicting which phage infects which host based on various sequence similarity measures or machine learning approaches. These are often based on whole viral and host genomes, but in metagenomics-based studies we rarely have whole genomes and must instead rely on contigs that are sometimes only hundreds of bp long. Therefore, we need programs that predict the hosts of phages on the basis of these short contigs. Although most existing programs can be applied to metagenomic datasets for these predictions, their accuracies are generally low. Here, we develop ContigNet, a convolutional neural network-based model capable of predicting phage–host matches based on relatively short contigs, and compare it to the previously published VirHostMatcher (VHM) and WIsH.


    Results

    On the validation set, ContigNet achieves area under the receiver operating characteristic curve (AUROC) scores of 72–85%, compared to a maximum of 68% for VHM or WIsH, for contigs of lengths from 200 bp to 50 kbp. We also apply the model to the Metagenomic Gut Virus (MGV) catalogue, a dataset containing a wide range of draft genomes from metagenomic samples, and achieve AUROC scores of 60–70%, compared to 52% for VHM and WIsH. Surprisingly, ContigNet can also be used to predict plasmid-host contig associations with high accuracy, indicating a similar genetic exchange between mobile genetic elements and their hosts.

    Availability and implementation

    The source code of ContigNet and related datasets can be downloaded from

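As a reading aid for the AUROC figures quoted in this abstract, the following is a minimal sketch of how an AUROC score can be computed from binary labels and prediction scores, using the pairwise-ranking (Mann-Whitney) formulation. The function name `auroc` and the toy inputs are illustrative assumptions; this is not ContigNet code and does not reproduce its evaluation.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    `labels` uses 1 for a true phage-host match and 0 otherwise
    (an illustrative convention, not taken from the source).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: three of the four positive/negative pairs are ranked correctly.
    print(auroc([1, 1, 0, 0], [0.9, 0.3, 0.4, 0.1]))
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the percentages above are reported.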
  5. Abstract

    Community dynamics are central in microbial ecology, yet we lack studies comparing diversity patterns among marine protists and prokaryotes over depth and multiple years. Here, we characterized microbes at the San Pedro Ocean Time-series (2005–2018), using SSU rRNA gene sequencing from two size fractions (0.2–1 and 1–80 μm), with a universal primer set that amplifies from both prokaryotes and eukaryotes, allowing direct comparisons of diversity patterns in a single set of analyses. The 16S + 18S rRNA gene composition in the small size fraction was mostly prokaryotic (>92%) as expected, but the large size fraction unexpectedly contained 46–93% prokaryotic 16S rRNA genes. Prokaryotes and protists showed opposite vertical diversity patterns; prokaryotic diversity peaked at mid-depth, protistan diversity at the surface. Temporal beta-diversity patterns indicated prokaryote communities were much more stable than protists. Although the prokaryotic communities changed monthly, the average community stayed remarkably steady over 14 years, showing high resilience. Additionally, particle-associated prokaryotes were more diverse than smaller free-living ones, especially at deeper depths, contributed unexpectedly by abundant and diverse SAR11 clade II. Eukaryotic diversity was strongly correlated with the diversity of particle-associated prokaryotes but not free-living ones, reflecting that physical associations result in the strongest interactions, including symbioses, parasitism, and decomposer relationships.

  6. Rappe, Michael S. (Ed.)
    ABSTRACT Bacterial biodegradation is a significant contributor to remineralization of polycyclic aromatic hydrocarbons (PAHs), toxic and recalcitrant components of crude oil as well as by-products of partial combustion chronically introduced into seawater via atmospheric deposition. The Deepwater Horizon oil spill demonstrated the speed at which a seed PAH-degrading community maintained by chronic inputs responds to acute pollution. We investigated the diversity and functional potential of a similar seed community in the chronically polluted Port of Los Angeles (POLA), using stable isotope probing with naphthalene, deep-sequenced metagenomes, and carbon incorporation rate measurements at the port and in two sites in the San Pedro Channel. We demonstrate the ability of the community of degraders at the POLA to incorporate carbon from naphthalene, leading to a quick shift in microbial community composition to be dominated by the normally rare Colwellia and Cycloclasticus. We show that metagenome-assembled genomes (MAGs) belonged to these naphthalene degraders by matching their 16S rRNA gene with experimental stable isotope probing data. Surprisingly, we did not find a full PAH degradation pathway in those genomes, even when combining genes from the entire microbial community, leading us to hypothesize that promiscuous dehydrogenases replace canonical naphthalene degradation enzymes in this site. We compared metabolic pathways identified in 29 genomes whose abundance increased in the presence of naphthalene to generate genomic-based recommendations for future optimization of PAH bioremediation at the POLA, e.g., ammonium as opposed to urea, heme or hemoproteins as an iron source, and polar amino acids.
    IMPORTANCE Oil spills in the marine environment have a devastating effect on marine life and biogeochemical cycles through bioaccumulation of toxic hydrocarbons and oxygen depletion by hydrocarbon-degrading bacteria. Oil-degrading bacteria occur naturally in the ocean, especially where they are supported by chronic inputs of oil or other organic carbon sources, and have a significant role in degradation of oil spills. Polycyclic aromatic hydrocarbons are the most persistent and toxic component of crude oil. Therefore, the bacteria that can break those molecules down are of particular importance. We identified such bacteria at the Port of Los Angeles (POLA), one of the busiest ports worldwide, and characterized their metabolic capabilities. We propose chemical targets based on those analyses to stimulate the activity of these bacteria in case of an oil spill in the POLA.
  7. Gilbert, Jack A. (Ed.)
    ABSTRACT Small subunit rRNA (SSU rRNA) amplicon sequencing can quantitatively and comprehensively profile natural microbiomes, representing a critically important tool for studying diverse global ecosystems. However, results will only be accurate if PCR primers perfectly match the rRNA of all organisms present. To evaluate how well marine microorganisms across all 3 domains are detected by this method, we compared commonly used primers with >300 million rRNA gene sequences retrieved from globally distributed marine metagenomes. The best-performing primers compared to 16S rRNA of bacteria and archaea were 515Y/926R and 515Y/806RB, which perfectly matched over 96% of all sequences. Considering cyanobacterial and chloroplast 16S rRNA, 515Y/926R had the highest coverage (99%), making this set ideal for quantifying marine primary producers. For eukaryotic 18S rRNA sequences, 515Y/926R also performed best (88%), followed by V4R/V4RB (18S rRNA specific; 82%), demonstrating that the 515Y/926R combination performs best overall for all 3 domains. Using Atlantic and Pacific Ocean samples, we demonstrate high correspondence between 515Y/926R amplicon abundances (generated for this study) and metagenomic 16S rRNA (median R² = 0.98, n = 272), indicating amplicons can produce equally accurate community composition data compared with shotgun metagenomics. Our analysis also revealed that expected performance of all primer sets could be improved with minor modifications, pointing toward a nearly completely universal primer set that could accurately quantify biogeochemically important taxa in ecosystems ranging from the deep sea to the surface. In addition, our reproducible bioinformatic workflow can guide microbiome researchers studying different ecosystems or human health to similarly improve existing primers and generate more accurate quantitative amplicon data.
    IMPORTANCE PCR amplification and sequencing of marker genes is a low-cost technique for monitoring prokaryotic and eukaryotic microbial communities across space and time but will work optimally only if environmental organisms match PCR primer sequences exactly. In this study, we evaluated how well primers match globally distributed short-read oceanic metagenomes. Our results demonstrate that primer sets vary widely in performance, and that at least for marine systems, rRNA amplicon data from some primers lack significant biases compared to metagenomes. We also show that it is theoretically possible to create a nearly universal primer set for diverse saline environments by defining a specific mixture of a few dozen oligonucleotides, and present a software pipeline that can guide rational design of primers for any environment with available meta’omic data.
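To illustrate what "perfectly matched" means for a degenerate primer, the following is a minimal sketch that checks a primer written with IUPAC nucleotide codes against candidate binding sites and reports the fraction matched. The function names, the toy primer `GTGYCAG`, and the toy sites are assumptions for illustration only; this is not the authors' workflow, and it assumes each site is already extracted, aligned to the primer, and in the same orientation.

```python
from typing import Iterable

# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def perfect_match(primer: str, site: str) -> bool:
    """True if every base of `site` is allowed by the primer code at that position."""
    if len(primer) != len(site):
        return False
    return all(
        base in IUPAC[code]
        for code, base in zip(primer.upper(), site.upper())
    )


def coverage(primer: str, sites: Iterable[str]) -> float:
    """Fraction of sites that the primer matches with zero mismatches."""
    site_list = list(sites)
    if not site_list:
        raise ValueError("no sites supplied")
    return sum(perfect_match(primer, s) for s in site_list) / len(site_list)


if __name__ == "__main__":
    toy_primer = "GTGYCAG"  # hypothetical primer; Y allows C or T
    toy_sites = ["GTGCCAG", "GTGTCAG", "GTGACAG", "GTGCCAA"]
    print(coverage(toy_primer, toy_sites))
```

Ambiguous bases in a site (for example `N`) are treated as mismatches here, which is a conservative choice made for this sketch.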
  8. Maximal growth rate is a basic parameter of microbial lifestyle that varies over several orders of magnitude, with doubling times ranging from a matter of minutes to multiple days. Growth rates are typically measured using laboratory culture experiments. Yet, we lack sufficient understanding of the physiology of most microbes to design appropriate culture conditions for them, severely limiting our ability to assess the global diversity of microbial growth rates. Genomic estimators of maximal growth rate provide a practical solution to survey the distribution of microbial growth potential, regardless of cultivation status. We developed an improved maximal growth rate estimator and predicted maximal growth rates from over 200,000 genomes, metagenome-assembled genomes, and single-cell amplified genomes to survey growth potential across the range of prokaryotic diversity; extensions allow estimates from 16S rRNA sequences alone as well as weighted community estimates from metagenomes. We compared the growth rates of cultivated and uncultivated organisms to illustrate how culture collections are strongly biased toward organisms capable of rapid growth. Finally, we found that organisms naturally group into two growth classes and observed a bias in growth predictions for extremely slow-growing organisms. These observations ultimately led us to suggest evolutionary definitions of oligotrophy and copiotrophy based on the selective regime an organism occupies. We found that these growth classes are associated with distinct selective regimes and genomic functional potentials.
  9. Free, publicly-accessible full text available December 7, 2023
  10. Abstract

    Growth rates are central to understanding microbial interactions and community dynamics. Metagenomic growth estimators have been developed, specifically codon usage bias (CUB) for maximum growth rates and “peak-to-trough ratio” (PTR) for in situ rates. Both were originally tested with pure cultures, but natural populations are more heterogeneous, especially in individual cell histories pertinent to PTR. To test these methods, we compared predictors with observed growth rates of freshly collected marine prokaryotes in unamended seawater. We prefiltered and diluted samples to remove grazers and greatly reduce virus infection, so net growth approximated gross growth. We sampled over 44 h for abundances and metagenomes, generating 101 metagenome-assembled genomes (MAGs), including Actinobacteria, Verrucomicrobia, SAR406, MGII archaea, etc. We tracked each MAG population by cell-abundance-normalized read recruitment, finding growth rates of 0 to 5.99 per day, the first reported rates for several groups, and used these rates as benchmarks. PTR, calculated by three methods, rarely correlated to growth (r~−0.26–0.08), except for rapidly growing γ-Proteobacteria (r~0.63–0.92), while CUB correlated moderately well to observed maximum growth rates (r = 0.57). This suggests that current PTR approaches poorly predict actual growth of most marine bacterial populations, but maximum growth rates can be approximated from genomic characteristics.
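The per-day growth rates and correlation coefficients reported above can be made concrete with a short sketch. It assumes simple exponential growth between two abundance measurements, so the specific growth rate is ln(N_end / N_start) divided by the elapsed time in days, and it compares predicted and observed rates with a Pearson correlation. The function names and numbers are illustrative assumptions; the abstract does not state the exact formula the authors used.

```python
import math
from typing import Sequence


def specific_growth_rate(n_start: float, n_end: float, hours: float) -> float:
    """Net specific growth rate (per day), assuming exponential growth."""
    if n_start <= 0 or n_end <= 0 or hours <= 0:
        raise ValueError("abundances and elapsed time must be positive")
    return math.log(n_end / n_start) / (hours / 24.0)


def doubling_time_days(mu_per_day: float) -> float:
    """Doubling time in days; infinite for non-growing populations."""
    if mu_per_day <= 0:
        return math.inf
    return math.log(2) / mu_per_day


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between predicted and observed rates."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical population that doubles in 24 h.
    mu = specific_growth_rate(1000.0, 2000.0, 24.0)
    print(mu, doubling_time_days(mu))
```

Under this assumption, a population whose abundance does not change has a rate of 0 per day, matching the lower end of the range quoted in the abstract.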