Title: “Mobilizing Millions of Mollusks of the Eastern Seaboard”: Digitization activities at the Bailey-Matthews National Shell Museum & Aquarium
"Mobilizing Millions of Mollusks of the Eastern Seaboard" (ESB) is a project sponsored by the National Science Foundation that improves our knowledge of mollusks from the East and Gulf coasts of the US. The four-year project is making taxonomically vetted and completely georeferenced occurrence data for 535,000 specimen lots, representing 4.5 million specimens, available online through the iDigBio, GBIF, and OBIS data aggregators. The ESB region includes 18 states and spans nearly 6,000 km from Maine to Texas. Seventeen major US collections, which together contain 85% of the ESB molluscan holdings in all US molluscan collections, are collaborating in the project. The ESB project improves the reliability of, and access to, molluscan collection data for examining changes in distribution, morphology, population size, and genetic variation within and across species. The Museum collection had already been digitized (cataloged electronically) at the start of the project (including 21,283 ESB lots); accordingly, the main goals of the project were cleaning data (improving the taxonomy, locality, dates, collecting data) and adding geolocation (geographic coordinates) to these lots. In addition, since the beginning of the project, we have digitized 3,897 newly acquired ESB lots consisting of 14,500 specimens. Other achievements include cleaning and standardizing collection metadata for 12,730 lots, adding geolocation data for 23,952 lots, and photographing 320 lots. Currently, the total number of ESB lots is 25,180, of which 24,201 have geolocation data.
Award ID(s):
2001528
PAR ID:
10534646
Author(s) / Creator(s):
;
Editor(s):
Leal, José H
Publisher / Repository:
Bailey-Matthews National Shell Museum & Aquarium
Date Published:
Edition / Version:
1
Volume:
14
Issue:
1
Page Range / eLocation ID:
11-11
Subject(s) / Keyword(s):
Collections databases biodiversity taxonomy eastern seaboard
Format(s):
Medium: X Other: pdf/a
Sponsoring Org:
National Science Foundation
More Like this
  1. Goldfarb, Keith (Ed.)
    Natural history collections are important depositories of biodiversity data. Digital photography of natural history collection specimens and subsequent dissemination of the resulting images on the web allow for the virtual discovery of these specimens, enhancing their accessibility to the target audience and the public in general. This presentation discusses digital photography of marine mollusks in collections, including some of the latest techniques for imaging of very small specimens, photography of specimens preserved in liquid, haptobionts, problems of color retention, transparency, 3-D photography, equipment, and other current areas of interest. Despite the focus on mollusks, the discussions can be extrapolated as generalities applicable to invertebrates from other phyla. The presentation also includes a discussion on equipment and the ideal digital parameters for imaging of natural history collection specimens, including image policies on acceptable file-format requirements for data hosts and aggregators such as iDigBio and others. (The presentation includes work funded in part by the NSF Thematic Collections Network grant award 2001528 “Mobilizing Millions of Mollusks from the Eastern Seaboard”). 
  2. In 2017 NSF funded “oVert (openVertebrate): Open Exploration of Vertebrate Diversity in 3D,” which is the first Thematic Collections Network devoted entirely to vertebrate morphological specimens. The primary goal of oVert is to generate and serve high-resolution digital three-dimensional data for internal anatomy across vertebrate diversity. oVert will CT-scan >20,000 fluid-preserved specimens representing >80% of the living genera of vertebrates, providing broad coverage for exploration and research on all major groups of vertebrates. Contrast-enhanced scans will be generated to reveal soft tissues and organs for a majority of the living vertebrate families. This collection of digital imagery and three-dimensional volumes will be open for exploration, download, and use. These new media will provide unprecedented global access to valuable morphological data of specimens in US collections. oVert is developing best practices and guidelines for high-throughput CT-scanning, including efficient workflows, preferred resolutions, and archival formats that optimize the variety of downstream applications. Using the Integrated Digitized Biocollections (iDigBio) API, we have developed a workflow where people uploading media files to MorphoSource can search for and import metadata for specimens directly from iDigBio. Via a Rich Site Summary (RSS) feed from MorphoSource, Audubon Core data describing media files for a given scientific collection can be retrieved and integrated into institutional IPT and databases. Such data migration of large files requires attention to detail and the development of data workflows that ensure correct specimen mapping at all steps. The RSS feed from MorphoSource will also consolidate usage information for media files from specimens in each scientific collection for reporting. Additional goals of the project are to provide information vital to the creation of collection best practices for imaging permissions/copyright. A status report and update on best practices will be presented.
  3. People are involved with the collection and curation of all biodiversity data, whether they are researchers, members of the public, taxonomists, conservationists, collection managers or wildlife managers. Knowing who those people are and connecting their biographical information to the biodiversity data they collect helps us contextualise their scientific work. We are particularly concerned with those people and communities involved in the collection and identification of biological specimens. People from herbaria and natural science museums have been collecting and preserving specimens from all over the world for more than 200 years. The problem is that many of these people are only known by unstandardized names written on specimen labels, often with only initials and without any biographical information. The process of identifying and linking individuals to their biographies enables us to improve the quality of the data held by collections while also quantifying the contributions of the often underappreciated people who collected and identified these specimens. This process improves our understanding of the history of collecting, and addresses current and future needs for maintaining the provenance of specimens so as to comply with national and international practices and regulations. In this talk we will outline the steps that collection managers, data scientists, curators, software engineers, and collectors can take to work towards fully disambiguated collections. With examples, we can show how they can use these data to help them in their work, in the evaluation of their collections, and in measuring the impact of individuals and organisations, local to global. 
  4. Conference Title: 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL). Conference Start Date: 2021, Sept. 27. Conference End Date: 2021, Sept. 30. Conference Location: Champaign, IL, USA.
    Metadata are key descriptors of research data, particularly for researchers seeking to apply machine learning (ML) to the vast collections of digitized specimens. Unfortunately, the available metadata is often sparse and, at times, erroneous. Additionally, it is prohibitively expensive to address these limitations through traditional, manual means. This paper reports on research that applies machine-driven approaches to analyzing digitized fish images and extracting various important features from them. The digitized fish specimens are being analyzed as part of the Biology Guided Neural Networks (BGNN) initiative, which is developing a novel class of artificial neural networks using phylogenies and anatomy ontologies. Automatically generated metadata is crucial for identifying the high-quality images needed for the neural network's predictive analytics. Methods that combine ML and image informatics techniques allow us to rapidly enrich the existing metadata associated with the 7,244 images from the Illinois Natural History Survey (INHS) used in our study. Results show we can accurately generate many key metadata properties relevant to the BGNN project, as well as general image quality metrics (e.g. brightness and contrast). Results also show that we can accurately generate bounding boxes and segmentation masks for fish, which are needed for subsequent machine learning analyses. The automatic process outperforms humans in terms of time and accuracy, and provides a novel solution for leveraging digitized specimens in ML. This research demonstrates the ability of computational methods to enhance the digital library services associated with the tens of thousands of digitized specimens stored in open-access repositories worldwide.
  5. Adam, N.; Neuhold, E.; Furuta, R. (Ed.)
    Metadata is a key data source for researchers seeking to apply machine learning (ML) to the vast collections of digitized biological specimens that can be found online. Unfortunately, the associated metadata is often sparse and, at times, erroneous. This paper extends previous research conducted with the Illinois Natural History Survey (INHS) collection (7,244 specimen images) that uses computational approaches to analyze image quality and then automatically generates 22 metadata properties representing the image quality and morphological features of the specimens. In the research reported here, we demonstrate the extension of our initial work to the University of Wisconsin Zoological Museum (UWZM) collection (4,155 specimen images). Further, we enhance our computational methods in four ways: (1) augmenting the training set, (2) applying contrast enhancement, (3) upscaling small objects, and (4) refining our processing logic. Together, these new methods reduced our overall error rate from 4.6% to 1.1%. These enhancements also allowed us to compute an additional set of 17 image-based metadata properties. The new metadata properties provide supplemental features and information that may also be used to analyze and classify the fish specimens. Examples of these new features include convex area, eccentricity, perimeter, and skew. The newly refined process further outperforms humans in terms of time and labor cost, as well as accuracy, providing a novel solution for leveraging digitized specimens with ML. This research demonstrates the ability of computational methods to enhance the digital library services associated with the tens of thousands of digitized specimens stored in open-access repositories worldwide by generating accurate and valuable metadata for those repositories.