This content will become publicly available on February 28, 2026

Title: Global Availability of Plant DNA Barcodes as Genomic Resources to Support Basic and Policy‐Relevant Biodiversity Research
ABSTRACT Genetic technologies such as DNA barcoding make it easier and less expensive to monitor biodiversity and its associated ecosystem services, particularly in biodiversity hotspots where traditional assessments are challenging. Successful use of these data‐driven technologies, however, requires access to appropriate reference data. We reviewed the >373,584 reference plant DNA barcodes in public repositories and found that they cumulatively cover a remarkable quarter of the ~435,000 extant land plant species (Embryophyta). Nevertheless, coverage gaps in tropical biodiversity hotspots reflect well‐documented biases in biodiversity science: most reference specimens originated in the Global North. Currently, at least 17% of plant families lack any reference barcode data whatsoever, affecting tropical and temperate regions alike. Investigators often emphasise the importance of marker choice and the need to ensure protocols are technically capable of detecting and identifying a broad range of taxa. Yet persistent geographic and taxonomic gaps in the reference datasets that these protocols rely upon risk undermining all downstream applications of the strategy, ranging from basic biodiversity monitoring to policy‐relevant objectives such as the forensic authentication of materials in illegal trade. Future networks of investigators could work strategically to improve data coverage, which will be essential in global efforts to conserve biodiversity while advancing more fair and equitable access to benefits arising from genetic resources.
Award ID(s):
2046797 2025816 2026294
PAR ID:
10576466
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  ;  ;  ;  ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Molecular Ecology
Volume:
34
Issue:
7
ISSN:
0962-1083
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary Biodiversity knowledge gaps and biases persist across low-income tropical regions. Genetic data are essential for addressing these issues, supporting biodiversity research and conservation planning. To assess progress in wildlife genetic sampling within the Philippines, I evaluated the scope, representativeness, and growth of publicly available genetic data and research on endemic vertebrates from the 1990s through 2024. Results showed that 82.3% of the Philippines’ 769 endemic vertebrates have genetic data, although major disparities remain. Reptiles had the least complete coverage but exhibited the highest growth, with birds, mammals, and amphibians following in that order. Species confined to smaller biogeographic subregions, with narrow geographic ranges, or classified as threatened or lacking threat assessments were disproportionately underrepresented. Research output on reptiles increased markedly, while amphibian research lagged behind. Although the number of non-unique authors in wildlife genetics studies involving Philippine specimens has grown steeply, Filipino involvement remains low. These results highlight the uneven and non-random distribution of wildlife genetic knowledge within this global biodiversity hotspot. Moreover, the limited participation of Global South researchers underscores broader inequities in wildlife genomics. Closing these gaps and addressing biases creates a more equitable and representative genetic knowledge base and supports its integration into national conservation efforts aligned with global biodiversity commitments. 
  2. Abstract Aim: Addressing global environmental challenges requires access to biodiversity data across wide spatial, temporal and taxonomic scales. Availability of such data has increased exponentially recently with the proliferation of biodiversity databases. However, heterogeneous coverage, protocols, and standards have hampered integration among these databases. To stimulate the next stage of data integration, here we present a synthesis of major databases, and investigate (a) how the coverage of databases varies across taxonomy, space, and record type; (b) what degree of integration is present among databases; (c) how integration of databases can increase biodiversity knowledge; and (d) the barriers to database integration. Location: Global. Time period: Contemporary. Major taxa studied: Plants and vertebrates. Methods: We reviewed 12 established biodiversity databases that mainly focus on geographic distributions and functional traits at global scale. We synthesized information from these databases to assess the status of their integration and major knowledge gaps and barriers to full integration. We estimated how improved integration can increase the data coverage for terrestrial plants and vertebrates. Results: Every database reviewed had a unique focus of data coverage. Exchanges of biodiversity information were common among databases, although not always clearly documented. Functional trait databases were more isolated than those pertaining to species distributions. Variation and potential incompatibility of taxonomic systems used by different databases posed a major barrier to data integration. We found that integration of distribution databases could lead to increased taxonomic coverage that corresponds to 23 years’ advancement in data accumulation, and improvement in taxonomic coverage could be as high as 22.4% for trait databases.
Main conclusions: Rapid increases in biodiversity knowledge can be achieved through the integration of databases, providing the data necessary to address critical environmental challenges. Full integration across databases will require tackling the major impediments to data integration: taxonomic incompatibility, lags in data exchange, barriers to effective data synchronization, and isolation of individual initiatives.
  3. Abstract Many applications in molecular ecology require the ability to match specific DNA sequences from single‐ or mixed‐species samples with a diagnostic reference library. Widely used methods for DNA barcoding and metabarcoding employ PCR and amplicon sequencing to identify taxa based on target sequences, but the target‐specific enrichment capabilities of CRISPR‐Cas systems may offer advantages in some applications. We identified 54,837 CRISPR‐Cas guide RNAs that may be useful for enriching chloroplast DNA across phylogenetically diverse plant species. We tested a subset of 17 guide RNAs in vitro to enrich plant DNA strands ranging in size from diagnostic DNA barcodes of 1,428 bp to entire chloroplast genomes of 121,284 bp. We used an Oxford Nanopore sequencer to evaluate sequencing success based on both single‐ and mixed‐species samples, which yielded mean chloroplast sequence lengths of 2,530–11,367 bp, depending on the experiment. In comparison to mixed‐species experiments, single‐species experiments yielded more on‐target sequence reads and greater mean pairwise identity between contigs and the plant species' reference genomes. Nevertheless, the mixed‐species experiments yielded sufficient data to provide a ≥48‐fold increase in sequence length and better estimates of relative abundance for a commercially prepared mixture of plant species compared to DNA metabarcoding based on the chloroplast trnL‐P6 marker. Prior work developed CRISPR‐based enrichment protocols for long‐read sequencing, and our experiments pioneered their use for plant DNA barcoding and chloroplast assemblies, which may have advantages over workflows that require PCR and short‐read sequencing. Future work would benefit from continuing to develop in vitro and in silico methods for CRISPR‐based analyses of mixed‐species samples, especially when the appropriate reference genomes for contig assembly cannot be known a priori.
  4. Abstract DNA‐based aquatic biomonitoring methods show promise to provide rapid, standardized, and efficient biodiversity assessment to supplement and in some cases replace current morphology‐based approaches that are often less efficient and can produce inconsistent results. Despite this potential, broad‐scale adoption of DNA‐based approaches by end‐users remains limited, and studies on how these two approaches differ in detecting aquatic biodiversity across large spatial scales are lacking. Here, we present a comparison of DNA metabarcoding and morphological identification, leveraging national‐scale, open‐source, ecological datasets from the National Ecological Observatory Network (NEON). Across 24 wadeable streams in North America with 179 paired sample comparisons, we found that DNA metabarcoding detected twice as many unique taxa as morphological identification overall. The two approaches showed poor congruence in detecting the same taxa, averaging 59%, 35%, and 23% of shared taxa detected at the order, family, and genus levels, respectively. Importantly, the two approaches detected different proportions of indicator taxa like %EPT and %Chironomidae. DNA metabarcoding detected far fewer Chironomid and Trichopteran taxa than morphological identification, but more Ephemeropteran and Plecopteran taxa, a result likely due to primer choice. Overall, our results showed that DNA metabarcoding and morphological identification detected different benthic macroinvertebrate communities. Despite these differences, we found that the same environmental variables were correlated with invertebrate community structure, suggesting that both approaches can accurately detect biodiversity patterns across environmental gradients.
Further refinement of DNA metabarcoding protocols, primers, and reference libraries–as well as more standardized, large‐scale comparative studies–may improve our understanding of the taxonomic agreement and data linkages between DNA metabarcoding and morphological approaches. 
  5. Abstract Intraspecific genetic diversity is a key aspect of biodiversity. Quaternary climatic change and glaciation influenced intraspecific genetic diversity by promoting range shifts and population size change. However, the extent to which glaciation affected genetic diversity on a global scale is not well established. Here we quantify nucleotide diversity, a common metric of intraspecific genetic diversity, in more than 38,000 plant and animal species using georeferenced DNA sequences from millions of samples. Results demonstrate that tropical species contain significantly more intraspecific genetic diversity than nontropical species. To explore potential evolutionary processes that may have contributed to this pattern, we calculated summary statistics that measure population demographic change and detected significant correlations between these statistics and latitude. We find that nontropical species are more likely to deviate from neutral expectations, indicating that they have historically experienced dramatic fluctuations in population size likely associated with Pleistocene glacial cycles. By analyzing the most comprehensive data set to date, our results imply that Quaternary climate perturbations may be more important as a process driving the latitudinal gradient in species richness than previously appreciated. 