Biodiversity genomics research requires reliable organismal identification, which can be difficult based on morphology alone. DNA-based identification using DNA barcoding can provide confirmation of species identity and resolve taxonomic issues but is rarely used in studies generating reference genomes. Here, we describe the development and implementation of DNA barcoding for the Darwin Tree of Life Project (DToL), which aims to sequence and assemble high quality reference genomes for all eukaryotic species in Britain and Ireland. We present a standardised framework for DNA barcode sequencing and data interpretation that is then adapted for diverse organismal groups. DNA barcoding data from over 12,000 DToL specimens has identified up to 20% of samples requiring additional verification, with 2% of seed plants and 3.5% of animal specimens subsequently having their names changed. We also make recommendations for future developments using new sequencing approaches and streamlined bioinformatic approaches. 
                            Most soil and litter arthropods are unidentifiable based on current DNA barcode reference libraries
                        
                    
    
            Abstract We are far from knowing all species living on the planet. Understanding biodiversity is demanding and requires time and expertise. Most groups are understudied given problems of identifying and delimiting species. DNA barcoding emerged to overcome some of the difficulties in identifying species. Its limitations derive from incomplete taxonomic knowledge and the lack of comprehensive DNA barcode libraries for many taxonomic groups. Here, we evaluate how useful barcoding is for identifying arthropods from highly diverse leaf litter communities in the southern Appalachian Mountains (USA). We used 3 reference databases and several automated classification methods on a data set including several arthropod groups. Acari, Araneae, Collembola, Coleoptera, Diptera, and Hymenoptera were well represented, showing different performances across methods and databases. Spiders performed the best, with correct identification rates to species and genus levels of ~50% across databases. Springtails performed poorly: no barcodes were identified to species or genus. Other groups showed poor to mediocre performance, from around 3% (mites) to 20% (beetles) of barcodes correctly identified to species, but also with some false identifications. In general, BOLD-based identification offered the best identification results but, in all cases except spiders, performance was poor, with less than a fifth of specimens correctly identified to genus or species. Our results indicate that the soil arthropod fauna is still insufficiently documented, with many species unrepresented in DNA barcode libraries. More effort toward integrative taxonomic characterization is needed to complete our reference libraries before we can rely on DNA barcoding as a universally applicable identification method.
        
    
- Award ID(s): 1916263
- PAR ID: 10487121
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Current Zoology
- ISSN: 1674-5507
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
            Abstract Characterization of species diversity of zooplankton is key to understanding, assessing, and predicting the function and future of pelagic ecosystems throughout the global ocean. The marine zooplankton assemblage, including only metazoans, is highly diverse and taxonomically complex, with an estimated ~28,000 species of 41 major taxonomic groups. This review provides a comprehensive summary of DNA sequences for the barcode region of mitochondrial cytochrome oxidase I (COI) for identified specimens. The foundation of this summary is the MetaZooGene Barcode Atlas and Database (MZGdb), a new open-access data and metadata portal that is linked to NCBI GenBank and BOLD data repositories. The MZGdb provides enhanced quality control and tools for assembling COI reference sequence databases that are specific to selected taxonomic groups and/or ocean regions, with associated metadata (e.g., collection georeferencing, verification of species identification, molecular protocols), and tools for statistical analysis, mapping, and visualization. To date, over 150,000 COI sequences for ~5600 described species of marine metazoan plankton (including holo- and meroplankton) are available via the MZGdb portal. This review uses the MZGdb as a resource for summaries of COI barcode data and metadata for important taxonomic groups of marine zooplankton and selected regions, including the North Atlantic, Arctic, North Pacific, and Southern Oceans. The MZGdb is designed to provide a foundation for analysis of species diversity of marine zooplankton based on DNA barcoding and metabarcoding for assessment of marine ecosystems and rapid detection of the impacts of climate change.
- 
            Abstract Because of the detrimental effects of terrestrial invasive plant species (TIPS) on native species, ecosystems, public health, and the economy, many countries have been actively looking for strategies to prevent the introduction and minimize the spread of TIPS. Fast and accurate detection of TIPS is essential to achieving these goals. Conventionally, invasive species monitoring has relied on morphological attributes. Recently, DNA‐based species identification (i.e., DNA barcoding) has become more attractive. To investigate whether DNA barcoding can aid in the detection and management of TIPS, we visited multiple nature areas in Southwest Michigan and collected a small piece of leaf tissue from 91 representative terrestrial plant species, most of which are invasive. We extracted DNA from the leaf samples, amplified four genomic loci (ITS, rbcL, matK, and trnH‐psbA) with PCR, and then purified and sequenced the PCR products. After careful examination of the sequencing data, we were able to identify reliable DNA barcode regions for most species and had an average PCR‐and‐sequencing success rate of 87.9%. We found that the species discrimination rate of a DNA barcode region is inversely related to the ease of PCR amplification and sequencing. Compared with rbcL and matK, ITS and trnH‐psbA have better species discrimination rates (80.6% and 63.2%, respectively). When ITS and trnH‐psbA are simultaneously used, the species discrimination rate increases to 97.1%. The high species/genus/family discrimination rates of DNA barcoding indicate that DNA barcoding can be successfully employed in TIPS identification. Further increases in the number of DNA barcode regions show little or no additional increases in the species discrimination rate, suggesting that dual‐barcode approaches (e.g., ITS + trnH‐psbA) might be an efficient and cost‐effective method in DNA‐based TIPS identification.
Close inspection of nucleotide sequences at the four DNA barcode regions among related species demonstrates that DNA barcoding is especially useful in identifying TIPS that are morphologically similar to other species.
- 
            Abstract Genetic tools are increasingly used to identify and discriminate between species. One key transition in this process was the recognition of the potential of the ca. 658 bp fragment of the organelle cytochrome c oxidase I (COI) as a barcode region, which revolutionized animal bioidentification and led, among others, to the instigation of the Barcode of Life Database (BOLD), currently containing barcodes from >7.9 million specimens. Following this discovery, suggestions for other organellar regions and markers, and the primers with which to amplify them, have been continuously proposed. Most recently, the field has taken the leap from PCR‐based generation of DNA references into shotgun sequencing‐based “genome skimming” alternatives, with the ultimate goal of assembling organellar reference genomes. Unfortunately, in genome skimming approaches, much of the nuclear genome (as much as 99% of the sequence data) is discarded, which is not only wasteful, but can also limit the power of discrimination at, or below, the species level. Here, we advocate that the full shotgun sequence data can be used to assign an identity (that we term for convenience its “DNA‐mark”) for both voucher and query samples, without requiring any computationally intensive pretreatment (e.g. assembly) of reads. We argue that if reference databases are populated with such “DNA‐marks,” it will enable future DNA‐based taxonomic identification to complement, or even replace, PCR of barcodes with genome skimming, and we discuss how such methodology ultimately could enable identification to population, or even individual, level.
- 
            Abstract Random DNA barcodes are a versatile tool for tracking cell lineages, with applications ranging from development to cancer to evolution. Here, we review and critically evaluate barcode designs as well as methods of barcode sequencing and initial processing of barcode data. We first demonstrate how various barcode design decisions affect data quality and propose a new design that balances all considerations that we are currently aware of. We then discuss various options for the preparation of barcode sequencing libraries, including inline indices and Unique Molecular Identifiers (UMIs). Finally, we test the performance of several established and new bioinformatic pipelines for the extraction of barcodes from raw sequencing reads and for error correction. We find that both alignment and regular expression-based approaches work well for barcode extraction, and that error-correction pipelines designed specifically for barcode data are superior to generic ones. Overall, this review will help researchers to approach their barcoding experiments in a deliberate and systematic way.
