Title: Developing a vocabulary and ontology for modeling insect natural history data: example data, use cases, and competency questions
Insects are possibly the most taxonomically and ecologically diverse class of multicellular organisms on Earth. Consequently, they provide nearly unlimited opportunities to develop and test ecological and evolutionary hypotheses. Currently, however, large-scale studies of insect ecology, behavior, and trait evolution are impeded by the difficulty in obtaining and analyzing data derived from natural history observations of insects. These data are typically highly heterogeneous and widely scattered among many sources, which makes developing robust information systems to aggregate and disseminate them a significant challenge. As a step towards this goal, we report initial results of a new effort to develop a standardized vocabulary and ontology for insect natural history data. In particular, we describe a new database of representative insect natural history data derived from multiple sources (but focused on data from specimens in biological collections), an analysis of the abstract conceptual areas required for a comprehensive ontology of insect natural history data, and a database of use cases and competency questions to guide the development of data systems for insect natural history data. We also discuss data modeling and technology-related challenges that must be overcome to implement robust integration of insect natural history data.
Award ID(s):
1612335
PAR ID:
10088924
Journal Name:
Biodiversity Data Journal
Volume:
7
ISSN:
1314-2828
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The spectacular radiation of insects has produced a stunning diversity of phenotypes. During the past 250 years, research on insect systematics has generated hundreds of terms for naming and comparing them. In its current form, this terminological diversity is presented in natural language and lacks formalization, which prohibits computer-assisted comparison using semantic web technologies. Here we propose a Model for Describing Cuticular Anatomical Structures (MoDCAS) which incorporates structural properties and positional relationships for standardized, consistent, and reproducible descriptions of arthropod phenotypes. We applied the MoDCAS framework in creating the ontology for the Anatomy of the Insect Skeleto-Muscular system (AISM). The AISM is the first general insect ontology that aims to cover all taxa by providing generalized, fully logical, and queryable, definitions for each term. It was built using the Ontology Development Kit (ODK), which maximizes interoperability with Uberon (Uberon multi-species anatomy ontology) and other basic ontologies, enhancing the integration of insect anatomy into the broader biological sciences. A template system for adding new terms, extending, and linking the AISM to additional anatomical, phenotypic, genetic, and chemical ontologies is also introduced. 
The AISM is proposed as the backbone for taxon-specific insect ontologies and has potential applications spanning systematic biology and biodiversity informatics, allowing users to (1) use controlled vocabularies and create semi-automated computer-parsable insect morphological descriptions; (2) integrate insect morphology into broader fields of research, including ontology-informed phylogenetic methods, logical homology hypothesis testing, evo-devo studies, and genotype to phenotype mapping; and (3) automate the extraction of morphological data from the literature, enabling the generation of large-scale phenomic data, by facilitating the production and testing of informatic tools able to extract, link, annotate, and process morphological data. This descriptive model and its ontological applications will allow for clear and semantically interoperable integration of arthropod phenotypes in biodiversity studies. 
  2. Galls are novel plant structures that develop in response to select biotic stressors. These structures, extended phenotypes of the inducer, usually serve to protect and feed the inducer or its progeny. This life history strategy has evolved dozens of times, and tens of thousands of species — including many bacteria, fungi, nematodes, mites and insects — are capable of manipulating plants in this way. The variation in gall phenotypes is extraordinary across species but usually predictable for each species of inducer. We introduce here a new ontology, GallOnt, that facilitates consistent descriptions and the semantic representation of and reasoning over plant gall phenotype data. GallOnt was largely developed from ontologies in the Open Biological and Biomedical Ontology (OBO) Foundry and stands to connect plant gall phenotypes to knowledge derived from model plant systems, including genotype-phenotype and agricultural research. We also introduce the idea of a new gall data standard — Minimum Information for the Description of Galls (MIDG version 0.1) — as a starting point for discussions regarding cecidology best practices. 
  3. Natalie Cooper (Ed.)
    1. Historical datasets can establish a critical baseline of plant–animal interactions for understanding contemporary interactions in the context of global change. Pollen is often incidentally preserved on animals in natural history collections. Techniques for removing pollen from insects have largely been developed for fresh insect specimens or historical specimens with large amounts of pollen on specialized structures. However, many key pollinating insects do not have these specialized structures and thus, there is a need for a method to extract pollen from these small and fragile insects. 2. Here, we propose a precision glycerine jelly swab tool to allow for the precise removal of pollen from old, small and fragile insect specimens. We use this tool to remove pollen from five families of insects collected in the late 1970s. Additionally, we compare our method with four previously published techniques for removing pollen from pinned contemporary specimens. 3. We show the functionality of the precision glycerine jelly swab for removing small quantities of pollen across insect families. We found that across the five methods, all removed pollen; yet, it was clear that some are better suited for fragile specimens. In particular, the traditional glycerine jelly swab and the precision glycerine jelly swabs both performed well for removing pollen from bee faces. The shaking wash resulted in specimen fracture and residue left behind, the ethanol rinses left setae matted, and the glycerol swabbing left residue on the specimen. Additionally, we present photographs documenting the effects of these methods on pinned honey bee specimens. 4. The precision glycerine jelly swab opens up opportunities to sample pollen from a variety of insects in natural history collections. 
These pollen samples can be incorporated into downstream analyses for pollen identification either via microscopy or DNA sequencing, and the resulting plant–insect interaction data can establish historical baselines for contemporary comparison. Beyond our application of this method to pollen on insects, this precision glycerine jelly swab tool could be used to explore pollen placement specialization or to sample bryophyte, fungal and tree fern spores dispersing on animals. 
  4. Abstract Insects are the most ubiquitous and diverse group of eukaryotic organisms on Earth, forming a crucial link in terrestrial and freshwater food webs. They have recently become the subject of headlines because of observations of dramatic declines in some places. Although there are hundreds of long‐term insect monitoring programs, a global database for long‐term data on insect assemblages has so far remained unavailable. In order to facilitate synthetic analyses of insect abundance changes, we compiled a database of long‐term (≥10 yr) studies of assemblages of insects (many also including arachnids) in the terrestrial and freshwater realms. We searched the scientific literature and public repositories for data on insect and arachnid monitoring using standardized protocols over a time span of 10 yr or longer, with at least two sampling events. We focused on studies that presented or allowed calculation of total community abundance or biomass. We extracted data from tables, figures, and appendices, and, for data sets that provided raw data, we standardized trapping effort over space and time when necessary. For each site, we extracted provenance details (such as country, state, and continent) as well as information on protection status, land use, and climatic details from publicly available GIS sources. In all, the database contains 1,668 plot‐level time series sourced from 165 studies with samples collected between 1925 and 2018. Sixteen data sets provided here were previously unpublished. Studies were separated into those collected in the terrestrial realm (103 studies with a total of 1,053 plots) and those collected in the freshwater realm (62 studies with 615 plots). Most studies were from Europe (48%) and North America (29%), with 34% of the plots located in protected areas. The median monitoring time span was 19 yr, with 12 sampling years. 
The number of individuals was reported in 129 studies, the total biomass was reported in 13 studies, and both abundance and biomass were reported in 23 studies. This data set is published under a CC‐BY license, requiring attribution of the data source. Please cite this paper if the data are used in publications, and respect the licenses of the original sources when using (part of) their data as detailed in Metadata S1: Table 1. 
  5. Spatially explicit, fine-grained datasets describing historical urban extents are rarely available prior to the era of operational remote sensing. However, such data are necessary to better understand long-term urbanization and land development processes and for the assessment of coupled nature–human systems (e.g., the dynamics of the wildland–urban interface). Herein, we propose a framework that jointly uses remote-sensing-derived human settlement data (i.e., the Global Human Settlement Layer, GHSL) and scanned, georeferenced historical maps to automatically generate historical urban extents for the early 20th century. By applying unsupervised color space segmentation to the historical maps, spatially constrained to the urban extents derived from the GHSL, our approach generates historical settlement extents for seamless integration with the multi-temporal GHSL. We apply our method to study areas in countries across four continents, and evaluate our approach against historical building density estimates from the Historical Settlement Data Compilation for the US (HISDAC-US), and against urban area estimates from the History Database of the Global Environment (HYDE). Our results achieve Area-under-the-Curve values >0.9 when comparing to HISDAC-US and are largely in agreement with model-based urban areas from the HYDE database, demonstrating that the integration of remote-sensing-derived observations and historical cartographic data sources opens up new, promising avenues for assessing urbanization and long-term land cover change in countries where historical maps are available. 