
Search for: All records

Creators/Authors contains: "Dupont, Chris L."


  1. Abstract Background

    With the advent of metagenomics, the importance of microorganisms and how their interactions are relevant to ecosystem resilience, sustainability, and human health has become evident. Cataloging and preserving biodiversity is paramount not only for the Earth’s natural systems but also for discovering solutions to challenges that we face as a growing civilization. Metagenomics pertains to the in silico study of all microorganisms within an ecological community in situ; however, many software suites recover only prokaryotes and have limited to no support for viruses and eukaryotes.


    In this study, we introduce the Viral Eukaryotic Bacterial Archaeal (VEBA) open-source software suite developed to recover genomes from all domains. To our knowledge, VEBA is the first end-to-end metagenomics suite that can directly recover, quality assess, and classify prokaryotic, eukaryotic, and viral genomes from metagenomes. VEBA implements a novel iterative binning procedure and hybrid sample-specific/multi-sample framework that yields more genomes than any existing methodology alone. VEBA includes a consensus microeukaryotic database containing proteins from existing databases to optimize microeukaryotic gene modeling and taxonomic classification. VEBA also provides a unique clustering-based dereplication strategy allowing for sample-specific genomes and genes to be directly compared across non-overlapping biological samples. Finally, VEBA is the only pipeline that automates the detection of candidate phyla radiation bacteria and implements the appropriate genome quality assessments. VEBA’s capabilities are demonstrated by reanalyzing 3 existing public datasets, which recovered a total of 948 MAGs (458 prokaryotic, 8 eukaryotic, and 482 viral) including several uncharacterized organisms and organisms with no public genome representatives.


    The VEBA software suite allows for the in silico recovery of microorganisms from all domains of life by integrating cutting-edge algorithms in novel ways. VEBA fully integrates both end-to-end and task-specific metagenomic analysis in a modular architecture that minimizes dependencies and maximizes productivity. VEBA contributes seamless end-to-end metagenomics analysis to the metagenomics community while also giving users the flexibility to perform specific analytical tasks. VEBA allows for the automation of several metagenomics steps and shows that new information can be recovered from existing datasets.

  2. Enzymes catalyze key reactions within Earth’s life-sustaining biogeochemical cycles. Here, we use metaproteomics to examine the enzymatic capabilities of the microbial community (0.2 to 3 µm) along a 5,000-km-long, 1-km-deep transect in the central Pacific Ocean. Eighty-five percent of total protein abundance was of bacterial origin, with Archaea contributing 1.6%. Over 2,000 functional KEGG Ontology (KO) groups were identified, yet only 25 KO groups contributed over half of the protein abundance, simultaneously indicating abundant key functions and a long tail of diverse functions. Vertical attenuation of individual proteins displayed stratification of nutrient transport, carbon utilization, and environmental stress. The microbial community also varied along horizontal scales, shaped by environmental features specific to the oligotrophic North Pacific Subtropical Gyre, the oxygen-depleted Eastern Tropical North Pacific, and nutrient-rich equatorial upwelling. Some of the most abundant proteins were associated with nitrification and C1 metabolisms, with observed interactions between these pathways. The oxidoreductases nitrite oxidoreductase (NxrAB), nitrite reductase (NirK), ammonia monooxygenase (AmoABC), manganese oxidase (MnxG), formate dehydrogenase (FdoGH and FDH), and carbon monoxide dehydrogenase (CoxLM) displayed distributions indicative of biogeochemical status such as oxidative or nutritional stress, with the potential to be more sensitive than chemical sensors. Enzymes that mediate transformations of atmospheric gases like CO, CO₂, NO, methanethiol, and methylamines were most abundant in the upwelling region. We identified hot spots of biochemical transformation in the central Pacific Ocean, highlighted previously understudied metabolic pathways in the environment, and provided rich empirical data for biogeochemical models critical for forecasting ecosystem response to climate change.
  3.
    Marine microeukaryotes play a fundamental role in biogeochemical cycling through the transfer of energy to higher trophic levels and vertical carbon transport. Despite their global importance, microeukaryote physiology, nutrient metabolism and contributions to carbon cycling across offshore ecosystems are poorly characterized. Here, we observed the prevalence of dinoflagellates along a 4,600-km meridional transect extending across the central Pacific Ocean, where oligotrophic gyres meet equatorial upwelling waters rich in macronutrients yet low in dissolved iron. A combined multi-omics and geochemical analysis provided a window into dinoflagellate metabolism across the transect, indicating a continuous taxonomic dinoflagellate community that shifted its functional transcriptome and proteome as it extended from the euphotic to the mesopelagic zone. In euphotic waters, multi-omics data suggested that a combination of trophic modes were utilized, while mesopelagic metabolism was marked by cytoskeletal investments and nutrient recycling. Rearrangement in nutrient metabolism was evident in response to variable nitrogen and iron regimes across the gradient, with no associated change in community assemblage. Total dinoflagellate proteins scaled with particulate carbon export, with both elevated in equatorial waters, suggesting a link between dinoflagellate abundance and total carbon flux. Dinoflagellates employ numerous metabolic strategies that enable broad occupation of central Pacific ecosystems and play a dual role in carbon transformation through both photosynthetic fixation in the euphotic zone and remineralization in the mesopelagic zone. 
  4. Abstract

    SAR86 is an abundant and ubiquitous heterotroph in the surface ocean that plays a central role in the function of marine ecosystems. We hypothesized that despite its ubiquity, different SAR86 subgroups may be endemic to specific ocean regions and functionally specialized for unique marine environments. However, the global biogeographical distributions of SAR86 genes, and the manner in which these distributions correlate with marine environments, have not been investigated. We quantified SAR86 gene content across globally distributed metagenomic samples and modeled these gene distributions as a function of 51 environmental variables. We identified five distinct clusters of genes within the SAR86 pangenome, each with a unique geographic distribution associated with specific environmental characteristics. Gene clusters are characterized by the strong taxonomic enrichment of distinct SAR86 genomes and partial assemblies, as well as differential enrichment of certain functional groups, suggesting differing functional and ecological roles of SAR86 ecotypes. We then leveraged our models and high-resolution, remote sensing-derived environmental data to predict the distributions of SAR86 gene clusters across the world’s oceans, creating global maps of SAR86 ecotype distributions. Our results reveal that SAR86 exhibits previously unknown, complex biogeography, and provide a framework for exploring geographic distributions of genetic diversity from other microbial clades.

  5. Summary

    Next‐generation sequencing technologies have generated, and continue to produce, an increasingly large corpus of biological data. The data generated are inherently compositional as they convey only relative information dependent upon the capacity of the instrument, experimental design and technical bias. There is considerable information to be gained through network analysis by studying the interactions between components within a system. Network theory methods using compositional data are powerful approaches for quantifying relationships between biological components and their relevance to phenotype, environmental conditions or other external variables. However, many of the statistical assumptions used for network analysis are not designed for compositional data and can bias downstream results. In this mini‐review, we illustrate the utility of network theory in biological systems and investigate modern techniques while introducing researchers to frameworks for implementation. We overview (1) compositional data analysis, (2) data transformations and (3) network theory along with insight on a battery of network types including static‐, temporal‐, sample‐specific‐ and differential‐networks. The intention of this mini‐review is not to provide a comprehensive overview of network methods, rather to introduce microbiology researchers to (semi)‐unsupervised data‐driven approaches for inferring latent structures that may give insight into biological phenomena or abstract mechanics of complex systems.

  6. Vast and diverse microbial communities exist within the ocean. To better understand the global influence of these microorganisms on Earth’s climate, we developed a robot capable of sampling dissolved and particulate seawater biochemistry across ocean basins while still capturing the fine-scale biogeochemical processes therein. Carbon and other nutrients are acquired and released by marine microorganisms as they build and break down organic matter. The scale of the ocean makes these processes globally relevant and, at the same time, challenging to fully characterize. Microbial community composition and ocean biochemistry vary across multiple physical scales up to that of the ocean basins. Other autonomous underwater vehicles are optimized for moving continuously and, primarily, horizontally through the ocean. In contrast, Clio, the robot that we describe, is designed to efficiently and precisely move vertically through the ocean, drift laterally in a Lagrangian manner to better observe water masses, and integrate with research vessel operations to map large horizontal scales to a depth of 6000 meters. We present results that show how Clio conducts high-resolution sensor surveys and sample return missions, including a mapping of 1144 kilometers of the Sargasso Sea to a depth of 1000 meters. We further show how the samples obtain filtered biomass from seawater that enable genomic and proteomic measurements not possible through in situ sensing. These results demonstrate a robotic oceanography approach for global-scale surveys of ocean biochemistry.
