Scientific collections have been built by people. For hundreds of years, people have collected, studied, identified, preserved, documented and curated collection specimens. Understanding who those people are is of interest to historians, but much more can be made of these data by other stakeholders once they have been linked to the people’s identities and their biographies. Knowing who people are helps us attribute work correctly, validate data and understand the scientific contribution of people and institutions. We can evaluate the work they have done, the interests they have, the places they have worked and what they have created from the specimens they have collected. The problem is that all we know about most of the people associated with collections are their names written on specimens. Disambiguating these people is the challenge that this paper addresses. Disambiguation of people often proves difficult in isolation and can result in staff or researchers independently trying to determine the identity of specific individuals over and over again. By sharing biographical data and building an open, collectively maintained dataset with shared knowledge, expertise and resources, it is possible to collectively deduce the identities of individuals, aggregate biographical information for each person, reduce duplication of effort and share the information locally and globally. The authors of this paper aspire to disambiguate all person names efficiently and fully in all their variations across the entirety of the biological sciences, starting with collections. 
Towards that vision, this paper has three key aims: to improve the linking, validation, enhancement and valorisation of person-related information within and between collections, databases and publications; to suggest good practice for identifying people involved in biological collections; and to promote coordination amongst all stakeholders, including individuals, natural history collections, institutions, learned societies, government agencies and data aggregators.
Collections do not have to Remain Ambiguous Forever: Seven steps to getting the correct people into your data
People are involved with the collection and curation of all biodiversity data, whether they are researchers, members of the public, taxonomists, conservationists, collection managers or wildlife managers. Knowing who those people are and connecting their biographical information to the biodiversity data they collect helps us contextualise their scientific work. We are particularly concerned with those people and communities involved in the collection and identification of biological specimens. People from herbaria and natural science museums have been collecting and preserving specimens from all over the world for more than 200 years. The problem is that many of these people are only known by unstandardized names written on specimen labels, often with only initials and without any biographical information. The process of identifying and linking individuals to their biographies enables us to improve the quality of the data held by collections while also quantifying the contributions of the often underappreciated people who collected and identified these specimens. This process improves our understanding of the history of collecting, and addresses current and future needs for maintaining the provenance of specimens so as to comply with national and international practices and regulations. In this talk we will outline the steps that collection managers, data scientists, curators, software engineers, and collectors can take to work towards fully disambiguated collections. With examples, we can show how they can use these data to help them in their work, in the evaluation of their collections, and in measuring the impact of individuals and organisations, local to global.
- Award ID(s): 2033973
- PAR ID: 10377060
- Date Published:
- Journal Name: Biodiversity Information Science and Standards
- Volume: 6
- ISSN: 2535-0897
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Biodiversity is the word used to describe the rich variety of life on Earth. Right now, Earth’s biodiversity is threatened. Museums, zoos, and other kinds of natural history collections help to protect biodiversity. One way they do this is by helping researchers study life on Earth. Another way is by teaching people, through exhibits and events. Natural history collections face many challenges. One challenge is getting enough money to stay open. Another is finding new space as collections grow. Finally, some people who want to use and learn from collections cannot access them because they are not nearby. Museum collections are now putting information on the internet, so that many people can access and use it. We can all help natural history collections to continue protecting Earth’s biodiversity by visiting them, volunteering, and donating specimens or other resources.
-
Insect research collections often include outreach drawers displaying specimens to enhance public awareness and access to scientific knowledge at various events. Despite their educational value, there is limited understanding of how these drawers are designed, used, or evaluated for quality. As a first step towards understanding these aspects, we surveyed members of the community who use insect drawers for public outreach. Survey results indicate that curators and collection managers consider outreach drawers important and use them widely at events, though they are rarely assessed beyond aesthetics and/or anecdotal audience feedback. The number and thematic scope of these drawers vary significantly among institutions, from as few as 3 to more than 50, and covering topics from collection history to pollinator conservation. However, few institutions display these collections online, limiting access to in-person events. Their maintenance and development are also often constrained by limited funding and staff availability. To guide decisions and efforts to enhance the educational impact and accessibility of outreach drawers, we introduce a quick-assessment tool based on five criteria: information, relevance, aesthetics, potential for engagement and inspiration. The next step is to apply appropriate tools to measure public engagement with these displays.
-
A wealth of information about how parasites interact with their hosts already exists in collections, scientific publications, specialized databases, and grey literature. The US National Science Foundation-funded Terrestrial Parasite Tracker Thematic Collection Network (TPT) project began in 2019 to help build a comprehensive picture of arthropod ectoparasites including the evolution of these parasite-host biotic associations, distributions, and the ecological interactions of disease vectors. TPT is a network of biodiversity collections whose data can assist scientists, educators, land managers, and policymakers to better understand the complex relationship between hosts and parasites including emergent properties that may explain the causes and frequency of human and wildlife pathogens. TPT member collections make their association information easier to access via Global Biotic Interactions (GloBI, Poelen et al. 2014), which is periodically archived through Zenodo to track progress in the TPT project. TPT leverages GloBI's ability to index biotic associations from specimen occurrence records that come from existing management systems (e.g., Arctos, Symbiota, EMu, Excel, MS Access) to avoid having to completely rework existing, or build new, cyber-infrastructures before collections can share data. TPT-affiliated collection managers use collection-specific translation tables to connect their verbatim (or original) terms used to describe associations (e.g., "ex", "found on", "host") to their interpreted, machine-readable terms in the OBO Relations Ontology (RO). These interpreted terms enable searches across previously siloed association record sets, while the original verbatim values remain accessible to help retain provenance and allow for interpretation improvements. TPT is an ambitious project, with the goal to database label data from over 1.2 million specimens of arthropod parasites of vertebrates coming from 22 collections across North America.
In the first year of the project, the TPT collections created over 73,700 new records and 41,984 images. In addition, 17 TPT data providers and three other collaborators shared datasets that are now indexed by GloBI, visible on the TPT GloBI project page. These datasets came from collection specimen occurrence records and literature sources. Two TPT data archives that capture and preserve the changes in the data coming from TPT to GloBI were published through Zenodo (Poelen et al. 2020a, Poelen et al. 2020b). The archives document the changes in how data are shared by collections including the biotic association data format and quantity of data captured. The Poelen et al. 2020b report included all TPT collections and biotic interactions from Arctos collections in VertNet and the Symbiota Collection of Arthropods Network (SCAN). The total number of interactions included in this report was 376,671 records (500,000 interactions is the overall goal for TPT). In addition, close coordination with TPT collection data managers including many one-on-one conversations, a workshop, and a webinar (Sullivan et al. 2020) was conducted to help guide the data capture of biotic associations. GloBI is an effective tool to help integrate biotic association data coming from occurrence records into an openly accessible, global, linked view of existing species interaction records. The results gleaned from the TPT workshop and Zenodo data archives demonstrate that minimizing changes to existing workflows allows for custom interpretation of collection-specific interaction terms. In addition, including collection data managers in the development of the interaction term vocabularies is an important part of the process that may improve data sharing and the overall downstream data quality.
-
When scientists study plants, they often collect, preserve, and store parts of the plants in a big collection called an herbarium. These plant specimens serve as proof that a species was growing in a certain place at a certain time. Herbaria (“herbaria” is the plural of herbarium) are where scientists describe new plant species and study how different species are related. Herbaria also contain lots of information about where certain plant species grow, what type of habitats species like, and at what time of year plants bloom and make fruits. Finally, herbaria are powerful tools for helping us understand how plants are affected by disturbances like habitat destruction and climate change. For all of these reasons, herbaria allow us to better understand and protect plant species all over the world. To continue benefitting from herbaria, we need to keep collecting plants and make these collections accessible to the world.

