Title: The Changing Uses of Herbarium Data in an Era of Global Change: An Overview Using Automated Content Analysis
Abstract

Widespread specimen digitization has greatly enhanced the use of herbarium data in scientific research. Publications using herbarium data have increased exponentially over the last century. Here, we review changing uses of herbaria through time with a computational text analysis of 13,702 articles from 1923 to 2017 that quantitatively complements traditional review approaches. Although maintaining its core contribution to taxonomic knowledge, herbarium use has diversified from a few dominant research topics a century ago (e.g., taxonomic notes, botanical history, local observations), with many topics only recently emerging (e.g., biodiversity informatics, global change biology, DNA analyses). Specimens are now appreciated as temporally and spatially extensive sources of genotypic, phenotypic, and biogeographic data. Specimens are increasingly used in ways that influence our ability to steward future biodiversity. As we enter the Anthropocene, herbaria have likewise entered a new era with enhanced scientific, educational, and societal relevance.

 
NSF-PAR ID:
10117202
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
BioScience
Volume:
69
Issue:
10
ISSN:
0006-3568
Page Range / eLocation ID:
p. 812-822
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Premise

    Plant biodiversity is threatened, yet many species remain undescribed. It is estimated that >50% of undescribed species have already been collected and are awaiting discovery in herbaria. Robust automatic species identification algorithms using machine learning could accelerate species discovery.

    Methods

    To encourage the development of an automatic species identification algorithm, we submitted our Herbarium 2019 data set to the Fine‐Grained Visual Categorization sub‐competition (FGVC6) hosted on the Kaggle platform. We chose to focus on the flowering plant family Melastomataceae because we have a large collection of imaged herbarium specimens (46,469 specimens representing 683 species) and taxonomic expertise in the family. As is common for herbarium collections, some species in this data set are represented by few specimens and others by many.

    Results

    In less than three months, the FGVC6 Herbarium 2019 Challenge drew 22 teams who entered 254 models for Melastomataceae species identification. The four best algorithms identified species with >88% accuracy.

    Discussion

    The FGVC competitions provide a unique opportunity for computer vision and machine learning experts to address difficult species‐recognition problems. The Herbarium 2019 Challenge brought together a novel combination of collections resources, taxonomic expertise, and collaboration between botanists and computer scientists.

     
  2. Herbarium collections shape our understanding of the world’s flora and are crucial for addressing global change and biodiversity conservation. The formation of such natural history collections, however, is not free from sociopolitical issues of immediate relevance. Despite increasing efforts addressing issues of representation and colonialism in natural history collections, herbaria have received comparatively less attention. While it has been noted that the majority of plant specimens are housed in the global North, the extent of this disparity has not been rigorously quantified to date. Here, by analyzing over 85 million specimen records and surveying herbaria across the globe, we assess the colonial legacy of botanical collections and how we may move towards a more inclusive future. We demonstrate that colonial exploitation has contributed to an inverse relationship between where plant biodiversity exists in nature and where it is housed in herbaria. Such disparities persist in herbaria across physical and digital realms despite overt colonialism having ended over half a century ago, suggesting ongoing digitization and decolonization efforts have yet to alleviate colonial-era discrepancies. We emphasize the need for acknowledging the inconvenient history of herbarium collections and the implementation of a more equitable, global paradigm for their collection, curation, and use.
  3. Abstract

    The widespread digitization of natural history collections, combined with novel tools and approaches, is revolutionizing biodiversity science. The ‘extended specimen’ concept advocates a more holistic approach in which a specimen is framed as a diverse stream of interconnected data. Herbarium specimens that by their very nature capture multispecies relationships, such as certain parasites, fungi and lichens, hold great potential to provide a broader and more integrative view of the ecology and evolution of symbiotic interactions. This particularly applies to parasite–host associations, which owing to their interconnectedness are especially vulnerable to global environmental change.

    Here, we present an overview of how parasitic flowering plants are represented in herbarium collections. We then discuss the variety of data that can be gathered from parasitic plant specimens, and how they can be used to understand global change impacts at multiple scales. Finally, we review best practices for sampling parasitic plants in the field, and subsequently preparing and digitizing these specimens.

    Plant parasitism has evolved 12 times within angiosperms, and similar to other plant taxa, herbarium collections represent the foundation for analysing key aspects of their ecology and evolution. Yet these collections hold far greater potential. Data and metadata obtained from parasitic plant specimens can inform analyses of co‐distribution patterns, changes in eco‐physiology and species plasticity spanning temporal and spatial scales, chemical ecology of tripartite interactions (e.g. host–parasite–herbivore), and molecular data critical for species conservation. Moreover, owing to the historic nature and sheer size of global herbarium collections, these data provide the spatiotemporal breadth essential for investigating organismal response to global change.

    Parasitic plant specimens are primed to serve as ideal examples of the extended specimen concept and help motivate the next generation of creative and impactful collection‐based science. Continued digitization efforts and improved curatorial practices will contribute to opening these specimens to a broader audience, allowing integrative research spanning multiple domains and offering novel opportunities for education.

     
  4. Premise

    With digitization and data sharing initiatives underway over the last 15 years, an important need has been prioritizing specimens to digitize. Because duplicate specimens are shared among herbaria in exchange and gift programs, we investigated the extent to which unique biogeographic data are held in small herbaria vs. these data being redundant with those held by larger institutions. We evaluated the unique specimen contributions that small herbaria make to biogeographic understanding at county, locality, and temporal scales.

    Methods

    We sampled herbarium specimens of 40 plant taxa from each of eight states of the United States of America in four broad status categories: extremely rare, very rare, common native, and introduced. We gathered geographic information from specimens held by large (≥100,000 specimens) and small (<100,000 specimens) herbaria. We built generalized linear mixed models to assess which features of the collections may best predict unique contributions of herbaria and used an Akaike information criterion‐based information‐theoretic approach for our model selection to choose the best model for each scale.

    Results

    Small herbaria contributed unique specimens at all scales in proportion with their contribution of specimens to our data set. The best models for all scales were the full models that included the factors of species status and herbarium size when accounting for state as a random variable.

    Conclusions

    We demonstrated that small herbaria contribute unique information for research. It is clear that unique contributions cannot be predicted based on herbarium size alone. We must prioritize digitization and data sharing from herbaria of all sizes.

     
  5. Traditionally, the generation and use of biodiversity data and their associated specimen objects have been primarily the purview of individuals and small research groups. While deposition of data and specimens in herbaria and other repositories has long been the norm, throughout most of their history, these resources have been accessible only to a small community of specialists. Through recent concerted efforts, primarily at the level of national and international governmental agencies over the last two decades, the pace of biodiversity data accumulation has accelerated, and a wider array of biodiversity scientists has gained access to this massive accumulation of resources, applying them to an ever‐widening compass of research pursuits. We review how these new resources and increasing access to them are affecting the landscape of biodiversity research in plants today, focusing on new applications across evolution, ecology, and other fields that have been enabled specifically by the availability of these data and the global scope that was previously beyond the reach of individual investigators. We give an overview of recent advances organized along three lines: broad‐scale analyses of distributional data and spatial information, phylogenetic research circumscribing large clades with comprehensive taxon sampling, and data sets derived from improved accessibility of biodiversity literature. We also review synergies between large data resources and more traditional data collection paradigms, describe shortfalls and how to overcome them, and reflect on the future of plant biodiversity analyses in light of increasing linkages between data types and scientists in our field.

     