Generating regional checklists for insects is frequently based on combining data sources ranging from literature and expert assertions that merely imply the existence of an occurrence to aggregated, standard-compliant data of uniquely identified specimens. The increasing diversity of data sources also means that checklist authors are faced with new responsibilities, effectively acting as filterers to select and utilize an expert-validated subset of all available data. Authors are also faced with the technical obstacle of bringing more occurrences into Darwin Core-based data aggregation, even if the corresponding specimens belong to external institutions. We illustrate these issues based on a partial update of the Kimsey et al. 2017 checklist of darkling beetles - Tenebrionidae sec. Bousquet et al. 2018 - inhabiting the Algodones Dunes of California. Our update entails 54 species-level concepts for this group and region, of which 31 concepts were found to be represented in three specimen-data aggregator portals, based on our interpretations of the aggregators' data. We reassess the distributions and biogeographic affinities of these species, focusing on taxa that are precinctive (highly geographically restricted) to the Lower Colorado River Valley in the context of recent dune formation from the Colorado River. Throughout, we apply taxonomic concept labels (taxonomic name according to source) to contextualize preferred name usages, but also show that the identification data of aggregated occurrences are very rarely well-contextualized or annotated. Doing so is a prerequisite for publishing open, dynamic checklist versions that finely accredit incremental expert efforts spent to improve the quality of checklists and aggregated occurrence data.
Why Toxicocalamus longhagen Roberts, Iova & Austin, 2022 (Serpentes, Elapidae) is a taxonomic nomen dubium
Roberts et al. (2022) presented a taxonomic decision in which they proposed the species name longhagen for a single, poorly preserved specimen of New Guinean elapid snake in the species assemblage known as the Toxicocalamus loriae Group. Geographically widespread populations in this species group had long been united under a single name even though some character variation had been noted, and only a thorough morphological study by Kraus et al. (2022), published shortly after the description of T. longhagen, confirmed additional species-level diversity and the detail of character analysis needed to differentiate species in this group. Their work made clear that only examination of many specimens would allow an assessment of interspecific variation and species boundaries, and this had been explained to the authors of the Roberts et al. paper ahead of their manuscript submission. The authors of the Kraus et al. paper had examined the specimen used to diagnose T. longhagen, as well as a series of similar specimens, and found it impossible to make a reliable species-level determination. Our detailed evaluation of the taxon longhagen reveals that it is insufficiently differentiated from the now-known species of the T. loriae Group, that it cannot confidently be assigned to any of these species, and that none of the existing specimens of snakes in this group can be assigned to T. longhagen. It follows that T. longhagen as currently defined is a taxonomic nomen dubium. It will retain this status until additional data or additional material can lead to a resolution of its taxonomy.
- Award ID(s): 2230919
- PAR ID: 10511536
- Publisher / Repository: Bionomia
- Date Published:
- Journal Name: Bionomina
- Volume: 32
- Issue: 1
- ISSN: 1179-7649
- Page Range / eLocation ID: 41 to 51
- Subject(s) / Keyword(s): Code, taxonomic status, New Guinea, nomenclature, mihi itch, worm-eating snake, herpetology
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
“What is crucial for your ability to communicate with me… pivots on the recipient’s capacity to interpret—to make good inferential sense of the meanings that the declarer is able to send” (Rescher 2000, p148). Conventional approaches to reconciling taxonomic information in biodiversity databases have been based on string matching for unique taxonomic name combinations (Kindt 2020, Norman et al. 2020). However, in their original context, these names pertain to specific usages or taxonomic concepts, which can subsequently vary for the same name as applied by different authors. Name-based synonym matching is a helpful first step (Guala 2016, Correia et al. 2018), but may still leave considerable ambiguity regarding proper usage (Fig. 1). Therefore, developing "taxonomic intelligence" is the bioinformatic challenge to adequately represent, and subsequently propagate, this complex name/usage interaction across trusted biodiversity data networks. How do we ensure that senders and recipients of biodiversity data not only can share messages but do so with “good inferential sense” of their respective meanings? Key obstacles have involved dealing with the complexity of taxonomic name/usage modifications through time, both in terms of accounting for and digitally representing the long histories of taxonomic change in most lineages. An important critique of proposals to use name-to-usage relationships for data aggregation has been the difficulty of scaling them up to reach comprehensive coverage, in contrast to name-based global taxonomic hierarchies (Bisby 2011). The Linnaean system of nomenclature has some unfortunate design limitations in this regard, in that taxonomic names are not unique identifiers, their meanings may change over time, and the names as a string of characters do not encode their proper usage, i.e., the name “Genus species” does not specify a source defining how to use the name correctly (Remsen 2016, Sterner and Franz 2017).
In practice, many people provide taxonomic names in their datasets or publications but not a source specifying a usage. The information needed to map the relationships between names and usages in taxonomic monographs or revisions is typically not presented in a machine-readable format. New approaches are making progress on these obstacles. Theoretical advances in the representation of taxonomic intelligence have made it increasingly possible to implement efficient querying and reasoning methods on name-usage relationships (Chen et al. 2014, Chawuthai et al. 2016, Franz et al. 2015). Perhaps most importantly, growing efforts to produce name-usage mappings on a medium scale by data providers and taxonomic authorities suggest an all-or-nothing approach is not required. Multiple high-profile biodiversity databases have implemented internal tools for explicitly tracking conflicting or dynamic taxonomic classifications, including eBird using concept relationships from AviBase (Lepage et al. 2014); NatureServe in its Biotics database; iNaturalist using its taxon framework (Loarie 2020); and the UNITE database for fungi (Nilsson et al. 2019). Other ongoing projects incorporating taxonomic intelligence include the Flora of Alaska (Flora of Alaska 2020), the Mammal Diversity Database (Mammal Diversity Database 2020) and PollardBase for butterfly population monitoring (Campbell et al. 2020).
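The contrast drawn in this abstract between matching on name strings and matching on name usages can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Record` type, the placeholder name "Aus bus", the two sources, and the `sec.` label format are hypothetical and are not taken from any of the databases named above.

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    record_id: str
    name: str    # taxonomic name string as written in the dataset
    source: str  # source defining the usage ("sec." source); "" if not given


def group_by_name(records):
    """Plain string matching: every record with the same name is merged."""
    groups = defaultdict(list)
    for r in records:
        groups[r.name].append(r.record_id)
    return dict(groups)


def group_by_concept(records):
    """Group by name-usage label, keeping different usages of one name apart."""
    groups = defaultdict(list)
    for r in records:
        source = r.source if r.source else "unspecified"
        groups[f"{r.name} sec. {source}"].append(r.record_id)
    return dict(groups)


# Hypothetical records: one name string, two stated usages, one unstated.
RECORDS = [
    Record("r1", "Aus bus", "Smith 1990"),
    Record("r2", "Aus bus", "Jones 2015"),
    Record("r3", "Aus bus", ""),
]

by_name = group_by_name(RECORDS)
by_concept = group_by_concept(RECORDS)
```

Under string matching all three records fall into one group, while the concept labels keep the two stated usages separate and flag the record that gives no source. Deciding whether those usages are actually congruent would still need an expert-provided mapping, which this sketch does not attempt.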
-
It takes great effort to manually or semi-automatically convert free-text phenotype narratives (e.g., morphological descriptions in taxonomic works) to a computable format before they can be used in large-scale analyses. We argue that neither a manual curation approach nor an information extraction approach based on machine learning is a sustainable solution to produce computable phenotypic data that are FAIR (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016). This is because these approaches do not scale to all biodiversity, and they do not stop the publication of free-text phenotypes that would need post-publication curation. In addition, both manual and machine learning approaches face great challenges: the problem of inter-curator variation (curators interpret/convert a phenotype differently from each other) in manual curation, and keywords to ontology concept translation in automated information extraction, make it difficult for either approach to produce data that are truly FAIR. Our empirical studies show that inter-curator variation in translating phenotype characters to Entity-Quality statements (Mabee et al. 2007) is as high as 40% even within a single project. With this level of variation, curated data integrated from multiple curation projects may still not be FAIR. The key causes of this variation have been identified as semantic vagueness in original phenotype descriptions and difficulties in using standardized vocabularies (ontologies). We argue that the authors describing characters are the key to the solution. Given the right tools and appropriate attribution, the authors should be in charge of developing a project's semantics and ontology. This will speed up ontology development and improve the semantic clarity of the descriptions from the moment of publication. 
In this presentation, we will introduce the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists, which consists of three components: a web-based, ontology-aware software application called 'Character Recorder,' which features a spreadsheet as the data entry platform and provides authors with the flexibility of using their preferred terminology in recording characters for a set of specimens (this application also facilitates semantic clarity and consistency across species descriptions); a set of services that produce RDF graph data, collect terms added by authors, detect potential conflicts between terms, dispatch conflicts to the third component, and update the ontology with resolutions; and an Android mobile application, 'Conflict Resolver,' which displays ontological conflicts and accepts solutions proposed by multiple experts. Fig. 1 shows the system diagram of the platform.
The presentation will consist of: a report on the findings from a recent survey of 90+ participants on the need for a tool like Character Recorder; a methods section that describes how we provide semantics to an existing vocabulary of quantitative characters through a set of properties that explain where and how a measurement (e.g., length of perigynium beak) is taken, and how a custom color palette of RGB values obtained from real specimens or high-quality specimen images can be used to help authors choose standardized color descriptions for plant specimens; and a software demonstration, where we show how Character Recorder and Conflict Resolver can work together to construct both human-readable descriptions and RDF graphs using morphological data derived from species in the plant genus Carex (sedges).
The key difference of this system from other ontology-aware systems is that authors can directly add needed terms to the ontology as they wish and can update their data according to ontology updates. The software modules currently incorporated in Character Recorder and Conflict Resolver have undergone formal usability studies. We are actively recruiting Carex experts to participate in a 3-day usability study of the entire system of the Platform for Author-Driven Computable Data and Ontology Production for Taxonomists. Participants will use the platform to record 100 characters about one Carex species. In addition to usability data, we will collect the terms that participants submit to the underlying ontology and the data related to conflict resolution. Such data allow us to examine the types and the quantities of logical conflicts that may result from the terms added by the users and to use Discrete Event Simulation models to understand if and how term additions and conflict resolutions converge. We look forward to a discussion on how the tools (Character Recorder is online at http://shark.sbs.arizona.edu/chrecorder/public) described in our presentation can contribute to producing and publishing FAIR data in taxonomic studies.
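One way a palette of RGB values could help an author pick a standardized color term is a nearest-neighbor lookup, sketched below in Python. This is only an illustration under stated assumptions: the palette entries, their RGB values, and the function name `nearest_color` are hypothetical, and the abstract does not say how Character Recorder actually matches colors.

```python
def nearest_color(rgb, palette):
    """Return the palette term whose RGB value is closest to `rgb`.

    Closeness is squared Euclidean distance in RGB space, a simple
    stand-in for whatever color comparison a real tool might use.
    """
    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, palette[name]))

    return min(palette, key=distance)


# Hypothetical palette: term -> (R, G, B). Values are illustrative only.
PALETTE = {
    "straw": (228, 217, 111),
    "olive green": (107, 142, 35),
    "chestnut brown": (149, 69, 53),
}

# A color sampled from a (hypothetical) specimen image.
sampled = (110, 140, 40)
term = nearest_color(sampled, PALETTE)
```

RGB distance does not track perceived color difference well, so a perceptual color space may be a better choice in practice. The sketch is meant only to show the lookup idea.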
-
The subspecies rank has been widely applied by taxonomists to capture infraspecific variation within the Linnaean classification system. Many subspecies described throughout the 20th century were recognised largely based on perceived variation in single morphological characters yet have since been found not to correspond to separately evolving population lineages, thus requiring synonymy or elevation to full species under lineage-based views of species. These modern lineage-based taxonomic resolutions have resulted from a combination of new molecular genetic techniques, improved geographical sampling of specimens, and more sophisticated analyses of morphological variation (e.g., statistical assessments rather than solely univariate descriptive ones). Here, we revisit the current taxonomic arrangement of species-level and subspecific taxa in the Lerista microtis (Gray) group, which is distributed along a narrow ~2000 km strip on the southern coast of Australia. From specimens of the L. microtis group, an additional species (Lerista arenicola) and two additional subspecies (L. m. intermedia and L. m. schwaneri) were described. We collected data on mensural, meristic, and colour pattern characters to explore morpho-spatial relationships among these taxa. Although our morphological analyses revealed some distinctiveness among specimens from locations assigned to each taxon, this variation is continuous along Australia’s southern coastline, assuming the form of a geographic cline rather than discrete forms. For many characters, however, spatial patterns were inconsistent with the original descriptions, particularly of the subspecies. Moreover, analysis of genome-wide restriction-associated DNA loci revealed multiple instances of paraphyly among taxa, with phylogenetic clustering of specimens assigned to distinct species and subspecies. These emerging patterns provide no support for L. arenicola as a species evolving separately from L. microtis.
Additionally, our findings challenge the presumed distinctiveness and coherence of the three subspecies of L. microtis. We thus synonymise L. arenicola and the L. microtis subspecies with L. microtis and provide a redescription of a single yet morphologically variable species, an arrangement that best reflects evolutionary history and the continuous nature of morphological variation across space.
-
Melanesian blindsnakes of the genus Gerrhopilus have been little collected or researched. I examined specimens assigned in museums to Gerrhopilus inornatus and found considerable morphological diversity among them that indicates the presence of multiple species. I redescribe G. inornatus (Boulenger) based on the holotype and one additional specimen, and I describe six new species among specimens currently subsumed under that name from Papua New Guinea: Gerrhopilus flavinotatus sp. nov., Gerrhopilus lorealis sp. nov., Gerrhopilus papuanorum sp. nov., Gerrhopilus polyadenus sp. nov., Gerrhopilus slapcinskyi sp. nov., and Gerrhopilus wallachi sp. nov. Each species is currently known from only 1–3 specimens, and all but two are known only from single localities. In addition to traditional information on scale counts, habitus, and color patterns, I found the numbers and distributions of epidermal glands among the head shields to be especially useful for discriminating among species. The number of recognized Melanesian Gerrhopilus has increased tremendously in recent years, but the region has been poorly sampled for these snakes, and it is to be expected that additional species will be identified at such time as surveys can more effectively target these cryptic snakes.

