Over 300 million arthropod specimens are housed in North American natural history collections. These collections represent a “vast hidden treasure trove” of biodiversity: 95% of the specimen label data have yet to be transcribed for research, and less than 2% of the specimens have been imaged. Specimen labels contain crucial information to determine species distributions over time and are essential for understanding patterns of ecology and evolution, which will help assess the growing biodiversity crisis driven by global change impacts. Specimen images offer indispensable insight and data for analyses of traits, and ecological and phylogenetic patterns of biodiversity. Here, we review North American arthropod collections using two key metrics, specimen holdings and digitization efforts, to assess the potential for collections to provide needed biodiversity data. We include data from 223 arthropod collections in North America, with an emphasis on the United States. Our specific findings are as follows: (1) The majority of North American natural history collections (88%) and specimens (89%) are located in the United States. Canada has comparable holdings to the United States relative to its estimated biodiversity. Mexico has made the furthest progress in terms of digitization, but its specimen holdings should be increased to reflect the estimated higher Mexican arthropod diversity. The proportion of North American collections that has been digitized, and the number of digital records available per species, are both much lower for arthropods when compared to chordates and plants. (2) The National Science Foundation’s decade-long ADBC program (Advancing Digitization of Biological Collections) has been transformational in promoting arthropod digitization. However, even if this program became permanent, at current rates, by the year 2050 only 38% of the existing arthropod specimens would be digitized, and less than 1% would have associated digital images.
(3) The number of specimens in collections has increased by approximately 1% per year over the past 30 years. We propose that this rate of increase is insufficient to provide enough data to address biodiversity research needs, and that arthropod collections should aim to triple their rate of new specimen acquisition. (4) The collections we surveyed in the United States vary broadly in a number of indicators. Collectively, there is depth and breadth, with smaller collections providing regional depth and larger collections providing greater global coverage. (5) Increased coordination across museums is needed for digitization efforts to target taxa for research and conservation goals and address long-term data needs. Two key recommendations emerge: collections should significantly increase both their specimen holdings and their digitization efforts to empower continental and global biodiversity data pipelines, and stimulate downstream research.
Digitization and the Future of Natural History Collections
Abstract Natural history collections (NHCs) are the foundation of historical baselines for assessing anthropogenic impacts on biodiversity. Along these lines, the online mobilization of specimens via digitization—the conversion of specimen data into accessible digital content—has greatly expanded the use of NHC collections across a diversity of disciplines. We broaden the current vision of digitization (Digitization 1.0)—whereby specimens are digitized within NHCs—to include new approaches that rely on digitized products rather than the physical specimen (Digitization 2.0). Digitization 2.0 builds on the data, workflows, and infrastructure produced by Digitization 1.0 to create digital-only workflows that facilitate digitization, curation, and data links, thus returning value to physical specimens by creating new layers of annotation, empowering a global community, and developing automated approaches to advance biodiversity discovery and conservation. These efforts will transform large-scale biodiversity assessments to address fundamental questions including those pertaining to critical issues of global change.
- PAR ID: 10164954
- Date Published:
- Journal Name: BioScience
- Volume: 70
- Issue: 3
- ISSN: 0006-3568
- Page Range / eLocation ID: 243 to 251
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Collections digitization relies increasingly upon computational and data management resources that occasionally exceed the capacity of natural history collections and their managers and curators. Digitization of many tens of thousands of micropaleontological specimen slides, as evidenced by the effort presented here by the Indiana University Paleontology Collection, has been a concerted effort in adherence to the recommended practices of multifaceted aspects of collections management for both physical and digital collections resources. This presentation highlights the contributions of distributed cyberinfrastructure from the National Science Foundation-supported Extreme Science and Engineering Discovery Environment (XSEDE) for web-hosting of collections management system resources and distributed processing of millions of digital images and metadata records of specimens from our collections. The Indiana University Center for Biological Research Collections is currently hosting its instance of the Specify collections management system (CMS) on a virtual server hosted on Jetstream, the cloud service for on-demand computational resources as provisioned by XSEDE. This web-service allows the CMS to be flexibly hosted on the cloud with additional services that can be provisioned on an as-needed basis for generating and integrating digitized collections objects in both web-friendly and digital preservation contexts. On-demand computing resources can be used for the manipulation of digital images for automated file I/O, scripted renaming of files for adherence to file naming conventions, derivative generation, and backup to our local tape archive for digital disaster preparedness and long-term storage. 
Here, we will present our strategies for facilitating reproducible workflows for general collections digitization of the IUPC nomenclatorial types and figured specimens in addition to the gigapixel resolution photographs of our large collection of microfossils using our GIGAmacro system (e.g., this slide of conodonts). We aim to demonstrate the flexibility and nimbleness of cloud computing resources for replicating this, and other, workflows to enhance the findability, accessibility, interoperability, and reproducibility of the data and metadata contained within our collections.
-
Premise: The digitization of natural history collections includes transcribing specimen label data into standardized formats. Born‐digital specimen data initially gathered in digital formats do not need to be transcribed, enabling their efficient integration into digitized collections. Modernizing field collection methods for born‐digital workflows requires the development of new tools and processes. Methods and Results: collNotes, a mobile application, was developed for Android and iOS to supplement traditional field journals. Designed for efficiency in the field, collNotes avoids redundant data entries and does not require cellular service. collBook, a companion desktop application, refines field notes into database‐ready formats and produces specimen labels. Conclusions: collNotes and collBook can be used in combination as a field‐to‐database solution for gathering born‐digital voucher specimen data for plants and fungi. Both programs are open source and use common file types, simplifying either program's integration into existing workflows.
-
Abstract The widespread digitization of natural history collections, combined with novel tools and approaches, is revolutionizing biodiversity science. The ‘extended specimen’ concept advocates a more holistic approach in which a specimen is framed as a diverse stream of interconnected data. Herbarium specimens that by their very nature capture multispecies relationships, such as certain parasites, fungi and lichens, hold great potential to provide a broader and more integrative view of the ecology and evolution of symbiotic interactions. This particularly applies to parasite–host associations, which owing to their interconnectedness are especially vulnerable to global environmental change. Here, we present an overview of how parasitic flowering plants are represented in herbarium collections. We then discuss the variety of data that can be gathered from parasitic plant specimens, and how they can be used to understand global change impacts at multiple scales. Finally, we review best practices for sampling parasitic plants in the field, and subsequently preparing and digitizing these specimens. Plant parasitism has evolved 12 times within angiosperms, and similar to other plant taxa, herbarium collections represent the foundation for analysing key aspects of their ecology and evolution. Yet these collections hold far greater potential. Data and metadata obtained from parasitic plant specimens can inform analyses of co‐distribution patterns, changes in eco‐physiology and species plasticity spanning temporal and spatial scales, chemical ecology of tripartite interactions (e.g. host–parasite–herbivore), and molecular data critical for species conservation.
Moreover, owing to the historic nature and sheer size of global herbarium collections, these data provide the spatiotemporal breadth essential for investigating organismal response to global change. Parasitic plant specimens are primed to serve as ideal examples of the extended specimen concept and help motivate the next generation of creative and impactful collection‐based science. Continued digitization efforts and improved curatorial practices will contribute to opening these specimens to a broader audience, allowing integrative research spanning multiple domains and offering novel opportunities for education.
-
Native bee species in the United States provide invaluable pollination services. Concerns about native bee declines are growing, and there are calls for a national monitoring program. Documenting species ranges at ecologically meaningful scales through coverage completeness analysis is a fundamental step to track bees from species to communities. It may take decades before all existing bee specimens are digitized, so projections are needed now to focus future research and management efforts. From 1.923 million records, we created range maps for nearly 88% (3158 species) of bee species in the contiguous United States, provided the first analysis of inventory completeness for digitized specimens of a major insect clade, and perhaps most important, estimated spatial completeness accounting for all known bee specimens in USA collections, including undigitized bee specimens. Completeness analyses were very low (3–37%) across four examined spatial resolutions when using the currently available bee specimen records. Adding a subset of observations from community science data sources did not significantly increase completeness, and adding a projected 4.7 million undigitized specimens increased completeness by only an additional 12–13%. Assessments of data, including projected specimen records, indicate persistent taxonomic and geographic deficiencies. In conjunction with expedited digitization, new inventories that integrate community science data with specimen‐based documentation will be required to close these gaps. A combined effort involving both strategic inventories and accelerated digitization campaigns is needed for a more complete understanding of USA bee distributions.