Search for: All records

Creators/Authors contains: "Sullivan, Matthew B."


  1. Abstract

    Our knowledge of viral sequence space has exploded with advancing sequencing technologies and large-scale sampling and analytical efforts. Though archaea are important and abundant prokaryotes in many systems, our knowledge of archaeal viruses outside of extreme environments is limited. This largely stems from the lack of a robust, high-throughput, and systematic way to distinguish between bacterial and archaeal viruses in datasets of curated viruses. Here we upgrade our prior text-based tool (MArVD) via training and testing a random forest machine learning algorithm against a newly curated dataset of archaeal viruses. After optimization, MArVD2 presented a significant improvement over its predecessor in terms of scalability, usability, and flexibility, and will allow user-defined custom training datasets as archaeal virus discovery progresses. Benchmarking showed that a model trained with viral sequences from the hypersaline, marine, and hot spring environments correctly classified 85% of the archaeal viruses with a false detection rate below 2% using a random forest prediction threshold of 80% in a separate benchmarking dataset from the same habitats.
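The benchmarking figures above (share of archaeal viruses recovered, and false detections, at a prediction threshold of 80%) can be illustrated with a minimal sketch. This is not MArVD2 code: the function name, the toy scores, and the reading of "false detection rate" as false positives divided by all positive calls are assumptions made for illustration only.

```python
def recall_and_false_detection_rate(scores, is_archaeal, threshold=0.80):
    """Return (recall, false detection rate) for calls made at `threshold`.

    scores      -- predicted probability that each sequence is an archaeal virus
    is_archaeal -- True/False ground-truth label for each sequence
    Assumption: false detection rate = FP / (TP + FP).
    """
    called = [s >= threshold for s in scores]
    tp = sum(c and y for c, y in zip(called, is_archaeal))
    fp = sum(c and not y for c, y in zip(called, is_archaeal))
    fn = sum((not c) and y for c, y in zip(called, is_archaeal))
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return recall, fdr


# Hypothetical classifier outputs for six benchmark sequences.
toy_scores = [0.95, 0.85, 0.60, 0.90, 0.30, 0.10]
toy_labels = [True, True, True, False, False, False]
recall, fdr = recall_and_false_detection_rate(toy_scores, toy_labels)
```

Raising the threshold generally trades recall for fewer false detections, which is why the abstract reports both numbers at a single stated threshold.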

  2. Abstract Background

    Microbes and their viruses are hidden engines driving Earth’s ecosystems from the oceans and soils to humans and bioreactors. Though gene marker approaches can now be complemented by genome-resolved studies of inter- (macrodiversity) and intra- (microdiversity) population variation, analytical tools to do so remain scattered or under-developed.

    Results

    Here, we introduce MetaPop, an open-source bioinformatic pipeline that provides a single interface to analyze and visualize microbial and viral community metagenomes at both the macro- and microdiversity levels. Macrodiversity estimates include population abundances and α- and β-diversity. Microdiversity calculations include identification of single nucleotide polymorphisms, novel codon-constrained linkage of SNPs, nucleotide diversity (π and θ), and selective pressures (pN/pS and Tajima’s D) within and fixation indices (F_ST) between populations. MetaPop will also identify genes with distinct codon usage. Following rigorous validation, we applied MetaPop to the gut viromes of autistic children that underwent fecal microbiota transfers and their neurotypical peers. The macrodiversity results confirmed our prior findings for viral populations (microbial shotgun metagenomes were not available) that diversity did not significantly differ between autistic and neurotypical children. However, by also quantifying microdiversity, MetaPop revealed lower average viral nucleotide diversity (π) in autistic children. The percentage of genomes detected under positive selection was also lower among autistic children, suggesting that higher viral π in neurotypical children may be beneficial because it allows populations to better “bet hedge” in changing environments. Further, comparisons of microdiversity pre- and post-FMT in autistic children revealed that the FMT delivery method (oral versus rectal) may influence viral activity and engraftment of microdiverse viral populations, with children who received their FMT rectally having higher microdiversity post-FMT. Overall, these results show that analyses at the macro level alone can miss important biological differences.

    Conclusions

    These findings suggest that standardized population and genetic variation analyses will be invaluable for maximizing biological inference, and MetaPop provides a convenient tool package to explore the dual impact of macro- and microdiversity across microbial communities.
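To make the nucleotide diversity (π) statistic mentioned above concrete, here is a minimal sketch of a per-site estimate from read allele counts. It is not MetaPop's implementation: the function names and the choice of the sample-size-corrected estimator, π = n/(n-1) × (1 - Σ p_i²), are assumptions for illustration.

```python
def site_nucleotide_diversity(allele_counts):
    """Per-site nucleotide diversity from allele counts, e.g. {"A": 8, "G": 2}.

    Uses the sample-size-corrected estimator n/(n-1) * (1 - sum(p_i^2)),
    where p_i is the frequency of allele i among the n reads covering the site.
    """
    n = sum(allele_counts.values())
    if n < 2:
        return 0.0  # diversity is undefined for fewer than two observations
    homozygosity = sum((count / n) ** 2 for count in allele_counts.values())
    return (n / (n - 1)) * (1.0 - homozygosity)


def mean_nucleotide_diversity(sites):
    """Average the per-site values over all sites of a gene or genome."""
    if not sites:
        return 0.0
    return sum(site_nucleotide_diversity(s) for s in sites) / len(sites)


# Hypothetical pileup: one polymorphic site and one monomorphic site.
toy_sites = [{"A": 8, "G": 2}, {"C": 10}]
toy_pi = mean_nucleotide_diversity(toy_sites)
```

A monomorphic site contributes zero, so a population with few polymorphic positions yields a low average π, the pattern the abstract reports for viral populations in autistic children.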
  3. Abstract Background

    Microbiomes are now recognized as the main drivers of ecosystem function ranging from the oceans and soils to humans and bioreactors. However, a grand challenge in microbiome science is to characterize and quantify the chemical currencies of organic matter (i.e., metabolites) that microbes respond to and alter. Critical to this has been the development of Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS), which has drastically increased molecular characterization of complex organic matter samples, but challenges users with hundreds of millions of data points where readily available, user-friendly, and customizable software tools are lacking.


    Here, we build on years of analytical experience with diverse sample types to develop MetaboDirect, an open-source, command-line-based pipeline for the analysis (e.g., chemodiversity analysis, multivariate statistics), visualization (e.g., Van Krevelen diagrams, elemental and molecular class composition plots), and presentation of direct injection high-resolution FT-ICR MS data sets after molecular formula assignment has been performed. When compared to other available FT-ICR MS software, MetaboDirect is superior in that it requires a single line of code to launch a fully automated framework for the generation and visualization of a wide range of plots, with minimal coding experience required. Among the tools evaluated, MetaboDirect is also uniquely able to automatically generate biochemical transformation networks (ab initio) based on mass differences (mass difference network-based approach) that provide an experimental assessment of metabolite connections within a given sample or a complex metabolic system, thereby providing important information about the nature of the samples and the set of microbial reactions or pathways that gave rise to them. Finally, for more experienced users, MetaboDirect allows users to customize plots, outputs, and analyses.


    Application of MetaboDirect to FT-ICR MS-based metabolomic data sets from a marine phage-bacterial infection experiment and a Sphagnum leachate microbiome incubation experiment showcases the exploration capabilities of the pipeline that will enable the research community to evaluate and interpret their data in greater depth and in less time. It will further advance our knowledge of how microbial communities influence and are influenced by the chemical makeup of the surrounding system. The source code and User’s guide of MetaboDirect are freely available through ( and (, respectively.
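The mass difference network idea described in this abstract can be sketched as follows: two detected masses are linked when their difference matches the mass of a known chemical transformation. This is not MetaboDirect code. The function name, the tolerance, the small transformation table, and the toy peak list are assumptions for illustration; the mass values are standard monoisotopic masses.

```python
from itertools import combinations

# Illustrative transformation table (monoisotopic mass differences, in Da).
# The labels are hypothetical shorthand, not MetaboDirect's transformation keys.
TRANSFORMATIONS = {
    "hydrogenation (H2)": 2.015650,
    "methylation (CH2)": 14.015650,
    "oxidation (O)": 15.994915,
    "hydration (H2O)": 18.010565,
    "carboxylation (CO2)": 43.989829,
}


def mass_difference_edges(masses, tolerance=0.001):
    """Return (lighter, heavier, transformation) for every matching mass pair."""
    edges = []
    for lighter, heavier in combinations(sorted(masses), 2):
        delta = heavier - lighter
        for name, expected in TRANSFORMATIONS.items():
            if abs(delta - expected) <= tolerance:
                edges.append((lighter, heavier, name))
    return edges


# Hypothetical peak list: a C6H12O6 mass plus its CH2 and H2O derivatives.
toy_masses = [180.06339, 194.07904, 198.07395]
toy_edges = mass_difference_edges(toy_masses)
```

The pairwise comparison is quadratic in the number of peaks, which is acceptable for a sketch; a real data set with many thousands of formulas would call for a sorted or indexed search instead.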

  4. Abstract

    The fate of oceanic carbon and nutrients depends on interactions between viruses, prokaryotes, and unicellular eukaryotes (protists) in a highly interconnected planktonic food web. To date, few controlled mechanistic studies of these interactions exist, and where they do, they are largely pairwise, focusing either on viral infection (i.e., virocells) or protist predation. Here we studied population-level responses of Synechococcus cyanobacterial virocells (i.e., cyanovirocells) to the protist Oxyrrhis marina using transcriptomics, endo- and exo-metabolomics, photosynthetic efficiency measurements, and microscopy. Protist presence had no measurable impact on Synechococcus transcripts or endometabolites. The cyanovirocells alone had a smaller intracellular transcriptional and metabolic response than cyanovirocells co-cultured with protists, displaying known patterns of virus-mediated metabolic reprogramming while releasing diverse exometabolites during infection. When protists were added, several exometabolites disappeared, suggesting microbial consumption. In addition, the intracellular cyanovirocell impact was largest, with 4.5- and 10-fold more host transcripts and endometabolites, respectively, responding to protists, especially those involved in resource and energy production. Physiologically, photosynthetic efficiency also increased, and together with the transcriptomics and metabolomics findings suggest that cyanovirocell metabolic demand is highest when protists are present. These data illustrate cyanovirocell responses to protist presence that are not yet considered when linking microbial physiology to global-scale biogeochemical processes.

  5. Abstract

    Viral metagenomics (viromics) has reshaped our understanding of DNA viral diversity, ecology, and evolution across Earth’s ecosystems. However, viromics now needs approaches to link newly discovered viruses to their host cells and characterize them at scale. This study adapts one such method, sequencing-enabled viral tagging (VT), to establish “Viral Tag and Grow” (VT + Grow) to rapidly capture and characterize viruses that infect a cultivated target bacterium, Pseudoalteromonas. First, baseline cytometric and microscopy data improved understanding of how infection conditions and host physiology impact populations in VT flow cytograms. Next, we extensively evaluated “and grow” capability to assess where VT signals reflect adsorption alone or wholly successful infections that lead to lysis. Third, we applied VT + Grow to a clonal virus stock, which, coupled to traditional plaque assays, revealed significant variability in burst size—findings that hint at a viral “individuality” parallel to the microbial phenotypic heterogeneity literature. Finally, we established a live protocol for public comment and improvement via to maximally empower the research community. Together these efforts provide a robust foundation for VT researchers, and establish VT + Grow as a promising scalable technology to capture and characterize viruses from mixed community source samples that infect cultivable bacteria.

  6. Abstract

    Microbial sulfur metabolism contributes to biogeochemical cycling on global scales. Sulfur metabolizing microbes are infected by phages that can encode auxiliary metabolic genes (AMGs) to alter sulfur metabolism within host cells but remain poorly characterized. Here we identified 191 phages derived from twelve environments that encoded 227 AMGs for oxidation of sulfur and thiosulfate (dsrA, dsrC/tusE, soxC, soxD and soxYZ). Retention of AMGs during niche-differentiation of diverse phage populations provided evidence that auxiliary metabolism imparts measurable fitness benefits to phages with ramifications for ecosystem biogeochemistry. Gene abundance and expression profiles of AMGs suggested significant contributions by phages to sulfur and thiosulfate oxidation in freshwater lakes and oceans, and a sensitive response to changing sulfur concentrations in hydrothermal environments. Overall, our study provides fundamental insights on the distribution, diversity, and ecology of phage auxiliary metabolism associated with sulfur and reinforces the necessity of incorporating viral contributions into biogeochemical configurations.
  7. Abstract Background

    Viruses are significant players in many biosphere and human ecosystems, but most signals remain “hidden” in metagenomic/metatranscriptomic sequence datasets due to the lack of universal gene markers and database representatives, and to insufficiently advanced identification tools.

    Results

    Here, we introduce VirSorter2, a DNA and RNA virus identification tool that leverages genome-informed database advances across a collection of customized automatic classifiers to improve the accuracy and range of virus sequence detection. When benchmarked against genomes from both isolated and uncultivated viruses, VirSorter2 uniquely performed consistently with high accuracy (F1-score > 0.8) across viral diversity, while all other tools under-detected viruses outside of the group most represented in reference databases (i.e., those in the order Caudovirales). Among the tools evaluated, VirSorter2 was also uniquely able to minimize errors associated with atypical cellular sequences including eukaryotic genomes and plasmids. Finally, as the virosphere exploration unravels novel viral sequences, VirSorter2’s modular design makes it inherently able to expand to new types of viruses via the design of new classifiers to maintain maximal sensitivity and specificity.

    Conclusion

    With multi-classifier and modular design, VirSorter2 demonstrates higher overall accuracy across major viral groups and will advance our knowledge of virus evolution, diversity, and virus-microbe interaction in various ecosystems. Source code of VirSorter2 is freely available ( ), and VirSorter2 is also available both on bioconda and as an iVirus app on CyVerse ( ).
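As a brief illustration of the benchmark criterion quoted above (F1-score > 0.8 across viral groups), the sketch below computes F1 per group from true-positive, false-positive, and false-negative counts. The group names and counts are invented for illustration and are not results from the VirSorter2 study; the helper names are likewise hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def f1_by_group(counts):
    """Map each viral group to its F1-score; counts[group] = (tp, fp, fn)."""
    return {group: f1_score(*c) for group, c in counts.items()}


# Hypothetical benchmark counts for two viral groups.
toy_counts = {"group_a": (90, 5, 10), "group_b": (40, 20, 40)}
toy_f1 = f1_by_group(toy_counts)
groups_above_cutoff = [g for g, f1 in toy_f1.items() if f1 > 0.8]
```

Reporting F1 per group rather than one pooled value is what exposes a tool that performs well only on the best-represented group.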
  8. Abstract

    Microbial communities in oxygen minimum zones (OMZs) are known to have significant impacts on global biogeochemical cycles, but viral influence on microbial processes in these regions is much less studied. Here we provide baseline ecological patterns using microscopy and viral metagenomics from the Eastern Tropical North Pacific (ETNP) OMZ region that enhance our understanding of viruses in these climate-critical systems. While extracellular viral abundance decreased below the oxycline, viral diversity and lytic infection frequency remained high within the OMZ, demonstrating that viral influences on microbial communities were still substantial without the detectable presence of oxygen. Viral community composition was strongly related to oxygen concentration, with viral populations in low-oxygen portions of the water column being distinct from their surface layer counterparts. However, this divergence was not accompanied by the expected differences in viral-encoded auxiliary metabolic genes (AMGs) relating to nitrogen and sulfur metabolisms that are known to be performed by microbial communities in these low-oxygen and anoxic regions. Instead, several abundant AMGs were identified in the oxycline and OMZ that may modulate host responses to low-oxygen stress. We hypothesize that this is due to selection for viral-encoded genes that influence host survivability rather than modulating host metabolic reactions within the ETNP OMZ. Together, this study shows that viruses are not only diverse throughout the water column in the ETNP, including the OMZ, but their infection of microorganisms has the potential to alter host physiological state within these biogeochemically important regions of the ocean.

  9. Abstract

    The marine picoeukaryote Bathycoccus prasinos has been considered a cosmopolitan alga, although recent studies indicate two ecotypes exist, Clade BI (B. prasinos) and Clade BII. Viruses that infect Bathycoccus Clade BI are known (BpVs), but none that infect BII. We isolated three dsDNA prasinoviruses from the Sargasso Sea against Clade BII isolate RCC716. The BII-Vs do not infect BI, and two (BII-V2 and BII-V3) have larger genomes (~210 kb) than BI-Viruses and BII-V1. BII-Vs share ~90% of their proteins, and between 65% and 83% of their proteins with sequenced BpVs. Phylogenomic reconstructions and PolB analyses establish close-relatedness of BII-V2 and BII-V3, yet BII-V2 has 10-fold higher infectivity and induces greater mortality on host isolate RCC716. BII-V1 is more distant, has a shorter latent period, and infects both available BII isolates, RCC716 and RCC715, while BII-V2 and BII-V3 do not exhibit productive infection of the latter in our experiments. Global metagenome analyses show Clade BI and BII algal relative abundances correlate positively with their respective viruses. The distributions delineate BI/BpVs as occupying lower temperature mesotrophic and coastal systems, whereas BII/BII-Vs occupy warmer temperature, higher salinity ecosystems. Accordingly, with molecular diagnostic support, we name Clade BII Bathycoccus calidus sp. nov. and propose that molecular diversity within this new species likely connects to the differentiated host-virus dynamics observed in our time course experiments. Overall, the tightly linked biogeography of Bathycoccus host and virus clades observed herein supports species-level host specificity, with strain-level variations in infection parameters.