Abstract Exploring the diversity of diazotrophs is key to understanding their role in supplying fixed nitrogen that supports marine productivity. A nested PCR assay using the universal primer set nifH1-nifH4, which targets the nitrogenase (nifH) gene, is a widely used approach for studying marine diazotrophs by amplicon sequencing. Metagenomics, direct sequencing of DNA without PCR, has provided complementary views of the diversity of marine diazotrophs. A significant fraction of the metagenome-derived nifH sequences (e.g. Planctomycete- and Proteobacteria-affiliated) were reported to have nucleotide mismatches with the nifH1-nifH4 primers, leading to the suggestion that nifH amplicon sequencing does not detect specific diazotrophic taxa and underrepresents diazotroph diversity. Here, we report that these mismatches are mostly located at a single base at the 5′-end of the nifH4 primer, which does not impact detection of the nifH genes. This is demonstrated by the presence of nifH genes that contain the nucleotide mismatches in a recent compilation of global ocean nifH amplicon datasets, with high relative abundances detected in a variety of samples. While the metagenome- and metatranscriptome-derived nifH genes accounted for 4.4% of the total amplicon sequence variants from the global ocean nifH amplicon database, the corresponding amplicon sequence variants can have high relative abundances (accounting for 47% of the reads in the database). These analyses underscore that nifH amplicon sequencing using the nifH1-nifH4 primers is an important tool for studying diversity of marine diazotrophs, particularly as a complement to metagenomics, which can provide taxonomic and metabolic information for some dominant groups.
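The primer-mismatch check described in this abstract can be illustrated with a short sketch. It is not the authors' analysis: the primer and binding-site strings are hypothetical toy sequences (not the real nifH4 primer), the function names are made up for illustration, and the 3-base window used to flag 3′-proximal mismatches is an arbitrary assumption. The sketch only shows how mismatch positions can be counted from the 5′ end of a degenerate primer.

```python
# IUPAC nucleotide codes mapped to the bases each code permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_mismatches(primer, site):
    """Return 1-based mismatch positions, counted from the primer's 5' end.

    `site` is assumed to be the binding site written in the same
    orientation as the primer, so position i of one aligns with
    position i of the other.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return [
        pos
        for pos, (p, s) in enumerate(zip(primer.upper(), site.upper()), start=1)
        if s not in IUPAC[p]
    ]


def near_three_prime(positions, primer_length, window=3):
    """True if any mismatch lies within `window` bases of the 3' end.

    The default window of 3 is an illustrative assumption, not a
    threshold taken from the study.
    """
    return any(pos > primer_length - window for pos in positions)


# Hypothetical toy sequences, for illustration only.
toy_primer = "TTYTAYGG"
toy_site = "ATCTATGG"

hits = primer_mismatches(toy_primer, toy_site)
print(hits)                                     # mismatch positions from the 5' end
print(near_three_prime(hits, len(toy_primer)))  # whether any is 3'-proximal
```

With the toy inputs above, the only mismatch falls at position 1 (the 5′-most base), so it is not flagged as 3′-proximal, which mirrors the kind of 5′-end mismatch the abstract reports as not preventing detection.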
vAMPirus: A versatile amplicon processing and analysis program for studying viruses
Abstract Amplicon sequencing is an effective and increasingly applied method for studying viral communities in the environment. Here, we present vAMPirus, a user‐friendly, comprehensive, and versatile DNA and RNA virus amplicon sequence analysis program, designed to support investigators in exploring virus amplicon sequencing data and running informed, reproducible analyses. vAMPirus takes raw virus amplicon libraries as input and, by default, performs nucleotide‐ and amino acid‐based analyses to produce results such as sequence abundance information, taxonomic classifications, phylogenies and community diversity metrics. The vAMPirus analytical framework leverages 16 different open‐source tools and provides optional approaches that can increase the ratio of biological signal‐to‐noise and thereby reveal patterns that would have otherwise been masked. Here, we validate the vAMPirus analytical framework and illustrate its implementation as a general virus amplicon sequencing workflow by recapitulating findings from two previously published double‐stranded DNA virus datasets. As a case study, we also apply the program to explore the diversity and distribution of a coral reef‐associated RNA virus. vAMPirus is streamlined within Nextflow, offering straightforward scalability, standardization and communication of virus lineage‐specific analyses. The vAMPirus framework is designed to be adaptable; community‐driven analytical standards will continue to be incorporated as the field advances. vAMPirus supports researchers in revealing patterns of virus diversity and population dynamics in nature, while promoting study reproducibility and comparability.
- PAR ID: 10566358
- Publisher / Repository: Wiley
- Date Published:
- Journal Name: Molecular Ecology Resources
- Volume: 24
- Issue: 6
- ISSN: 1755-098X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Although our understanding of the microbial diversity found within a given system expands as amplicon sequencing improves, technical aspects still drastically affect which members can be detected. Compared with prokaryotic members, the eukaryotic microorganisms associated with a host are understudied due to their underrepresentation in ribosomal databases, lower abundance compared with bacterial sequences, and higher ribosomal gene identity to their eukaryotic host. Peptide nucleic acid (PNA) blockers are often designed to reduce amplification of host DNA. Here we present a tool for PNA design called the Microbiome Amplification Preference Tool (MAPT). We examine the effectiveness of a PNA designed to block genomic Medicago sativa DNA (gPNA) compared with unrelated surrounding plants from the same location. We applied mitochondrial PNA and plastid PNA to block the majority of DNA from plant mitochondria and plastid 16S ribosomal RNA genes, as well as the novel gPNA. Until now, amplifying both eukaryotic and prokaryotic reads using 515F-Y and 926R has not been applied to a host. We investigate the efficacy of this gPNA using three approaches: (i) in silico prediction of blocking potential in MAPT, (ii) amplicon sequencing with and without the addition of PNAs, and (iii) comparison with cultured fungal representatives. When gPNA is added during amplicon library preparation, the diversity of unique eukaryotic amplicon sequence variants present in M. sativa increases. We provide a layered examination of the costs and benefits of using PNAs during sequencing. The application of MAPT enables scientists to design PNAs specifically to enable capturing greater diversity in their system.
- Elkins, Christopher A (Ed.) ABSTRACT Municipal wastewater harbors diverse RNA viruses, which are responsible for many emerging and reemerging diseases in humans, animals, and plants. Although genomic sequencing can be a high-throughput approach for profiling the RNA virome in wastewater, wastewater processing methods often influence sequencing outcomes. Here, we systematically evaluated two wastewater processing methods, tangential-flow ultrafiltration (TFF) and Nanotrap Microbiome A Particles, for detecting the target RNA virus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) via amplicon sequencing and characterizing the RNA virome using whole-transcriptome shotgun sequencing. Our results from paired comparison tests showed that the TFF and Nanotrap methods recovered similar SARS-CoV-2 variants at the lineage level (analysis of similarity [ANOSIM] R = −0.012, P = 0.874). Optimizing automated procedures for the Nanotrap method and concentration factors for the TFF method was critical for achieving high-depth and high-breadth coverage of the target virus genome. Notably, the two methods enriched distinct RNA viromes from the same wastewater samples (ANOSIM R = 0.260, P = 0.002), with TFF samples showing 22-fold and 7-fold higher relative abundances of Reoviridae and Coronaviridae, respectively. These differences are likely due to the distinct virus concentration mechanisms employed by each method, which are influenced by liquid-solid partitioning of virus particles and interactions of viral surface proteins with ligands. Our findings underscore the importance of optimizing wastewater processing methods for genomic monitoring and have implications for broader environmental applications. IMPORTANCE Wastewater genomic sequencing is an emerging technology for tracking viral infections within communities. However, different methods for concentrating viruses and extracting nucleic acids can influence the recovery of the RNA virome from wastewater. An in-depth understanding of virus concentration mechanisms and their impact on sequencing data quality and bioinformatic output would be critical to guide method selection and optimization. Specifically, this study systematically evaluated tangential-flow ultrafiltration and Nanotrap microbiome particles for their application to sequencing SARS-CoV-2 and the whole RNA virome from wastewater. Both methods yielded high-quality sequencing data for amplicon sequencing of SARS-CoV-2, but their outcomes diverged in the recovered RNA virome. We identified RNA viruses that are preferentially recovered by each of these two methods and proposed considerations for method selection in future studies of the wastewater RNA virome.
- Abstract Many applications in molecular ecology require the ability to match specific DNA sequences from single‐ or mixed‐species samples with a diagnostic reference library. Widely used methods for DNA barcoding and metabarcoding employ PCR and amplicon sequencing to identify taxa based on target sequences, but the target‐specific enrichment capabilities of CRISPR‐Cas systems may offer advantages in some applications. We identified 54,837 CRISPR‐Cas guide RNAs that may be useful for enriching chloroplast DNA across phylogenetically diverse plant species. We tested a subset of 17 guide RNAs in vitro to enrich plant DNA strands ranging in size from diagnostic DNA barcodes of 1,428 bp to entire chloroplast genomes of 121,284 bp. We used an Oxford Nanopore sequencer to evaluate sequencing success based on both single‐ and mixed‐species samples, which yielded mean chloroplast sequence lengths of 2,530–11,367 bp, depending on the experiment. In comparison to mixed‐species experiments, single‐species experiments yielded more on‐target sequence reads and greater mean pairwise identity between contigs and the plant species' reference genomes. Nevertheless, the mixed‐species experiments yielded sufficient data to provide ≥48‐fold increase in sequence length and better estimates of relative abundance for a commercially prepared mixture of plant species compared to DNA metabarcoding based on the chloroplast trnL‐P6 marker. Prior work developed CRISPR‐based enrichment protocols for long‐read sequencing and our experiments pioneered its use for plant DNA barcoding and chloroplast assemblies that may have advantages over workflows that require PCR and short‐read sequencing. Future work would benefit from continuing to develop in vitro and in silico methods for CRISPR‐based analyses of mixed‐species samples, especially when the appropriate reference genomes for contig assembly cannot be known a priori.
- ABSTRACT DNA metabarcoding of zooplankton biodiversity is used increasingly for monitoring global ocean ecosystems, requiring comparable data from different research laboratories and ocean regions. The MetaZooGene Intercalibration Experiment (MZG‐ICE) was designed to examine and analyse patterns of variation of DNA sequence data resulting from multi‐gene metabarcoding of 10 zooplankton samples carried out by 10 research groups affiliated with the Scientific Committee for Ocean Research (SCOR). Aliquots of DNA extracted from the 10 zooplankton samples were distributed to MZG‐ICE groups for metabarcoding of four gene regions: V1‐V2, V4 and V9 of nuclear 18S rRNA and mitochondrial COI. Molecular protocols and procedures were recommended; substitutions were allowed as necessary. Resulting data were uploaded to a common repository for centralised statistics and bioinformatics. Based on proportional sequence numbers for abundant phyla, overall patterns of variation were consistent across many (but not all) MZG‐ICE groups. V9 showed highest similarity, followed (in order) by V4, V1‐V2, and COI. Outlier data were hypothesised to result from the use of different PCR protocols and sequencing platforms, and possible contamination. MZG‐ICE results indicated that DNA metabarcoding data from different laboratories and research groups can provide reliable, accurate and valid descriptions of biodiversity of zooplankton throughout the ocean. Recommendations included: pre‐screening QA/QC of raw data, detailed records for laboratory protocols, reagents, and instrumentation, and centralised bioinformatics and multivariate statistics. In the absence of universal agreement on standardised protocols or best practices, intercalibration is the best way forward toward validation of DNA metabarcoding of zooplankton diversity for global ocean monitoring.

