This content will become publicly available on January 28, 2026

Title: MIReVTD, a Minimum Information Standard for Reporting Vector Trait Data
Abstract Vector-borne diseases pose a persistent and increasing challenge to human, animal, and agricultural systems globally. Mathematical modeling frameworks incorporating vector trait responses are powerful tools to assess risk and predict vector-borne disease impacts. Developing these frameworks and the reliability of their predictions hinge on the availability of experimentally derived vector trait data for model parameterization and inference of the biological mechanisms underpinning transmission. Trait experiments have generated data for many known and potential vector species, but the terminology used across studies is inconsistent, and accompanying publications may share data with insufficient detail for reuse or synthesis. The lack of data standardization can lead to information loss and prohibits analytical comprehensiveness. Here, we present MIReVTD, a Minimum Information standard for Reporting Vector Trait Data. Our reporting checklist balances completeness and labor-intensiveness with the goal of making these important experimental data easier to find and reuse, without onerous effort for scientists generating the data. To illustrate the standard, we provide an example reproducing results from an Aedes aegypti mosquito study.
Award ID(s):
2016265
PAR ID:
10603978
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
bioRxiv
Date Published:
Format(s):
Medium: X
Institution:
bioRxiv
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The growing threat of vector-borne diseases, highlighted by recent epidemics, has prompted increased focus on the fundamental biology of vector-virus interactions. To this end, experiments are often the most reliable way to measure vector competence (the potential for arthropod vectors to transmit certain pathogens). Data from these experiments are critical to understand outbreak risk, but – despite having been collected and reported for a large range of vector-pathogen combinations – terminology is inconsistent, records are scattered across studies, and the accompanying publications often share data with insufficient detail for reuse or synthesis. Here, we present a minimum data and metadata standard for reporting the results of vector competence experiments. Our reporting checklist strikes a balance between completeness and labor-intensiveness, with the goal of making these important experimental data easier to find and reuse in the future, without much added effort for the scientists generating the data. To illustrate the standard, we provide an example that reproduces results from a study of Aedes aegypti vector competence for Zika virus.
  2. Haldorai, Anandakumar (Ed.)
    Darwin Core, the data standard used for sharing modern biodiversity and paleodiversity occurrence records, has previously lacked proper mechanisms for reporting what is known about the estimated age range of specimens from deep time. This has led to data providers putting these data in fields where they cannot easily be found by users, which impedes the reuse and improvement of these data by other researchers. Here we describe the development of the Chronometric Age Extension to Darwin Core, a ratified, community-developed extension that enables the reporting of ages of specimens from deeper time and the evidence supporting these estimates. The extension standardizes reporting about the methods or assays used to determine an age and other critical information like uncertainty. It gives data providers flexibility about the level of detail reported, focusing on the minimum information needed for reuse while still allowing for significant detail if providers have it. Providing a standardized format for reporting these data will make them easier to find and search and enable researchers to pinpoint specimens of interest for data improvement or accumulate more data for broad temporal studies. The Chronometric Age Extension was also the first community-managed vocabulary to undergo the new Biodiversity Informatics Standards (TDWG) review and ratification process, thus providing a blueprint for future Darwin Core extension development. 
  3. György Barabás (Ed.)
    Predicting how climate warming affects vector borne diseases is a key research priority. The prevailing approach uses the basic reproductive number (R0) to predict warming effects. However, R0 is derived under assumptions of stationary thermal environments; using it to predict disease spread in non-stationary environments could lead to erroneous predictions. Here, we develop a trait-based mathematical model that can predict disease spread and prevalence for any vector borne disease under any type of non-stationary environment. We parameterize the model with trait response data for the Malaria vector and pathogen to test the latest IPCC predictions on warmer-than-average winters and hotter-than-average summers. We report three key findings. First, the R0 formulation commonly used to investigate warming effects on disease spread violates the assumptions underlying its derivation as the dominant eigenvalue of a linearized host-vector model. As a result, it overestimates disease spread in cooler environments and underestimates it in warmer environments, proving its predictions to be unreliable even in a constant thermal environment. Second, hotter-than-average summers both narrow the thermal limits for disease prevalence, and reduce prevalence within those limits, to a much greater degree than warmer-than-average winters, highlighting the importance of hot extremes in driving disease burden. Third, while warming reduces infected vector populations through the compounding effects of adult mortality, and infected host populations through the interactive effects of mortality and transmission, uninfected vector populations prove surprisingly robust to warming. This suggests that ecological predictions of warming-induced reductions in disease burden should be tempered by the evolutionary possibility of vector adaptation to both cooler and warmer climates.
  4. Faraji, Ary (Ed.)
    Abstract A growing body of information on vector-borne diseases has arisen as increasing research focus has been directed towards the need for anticipating risk, optimizing surveillance, and understanding the fundamental biology of vector-borne diseases to direct control and mitigation efforts. The scope and scale of this information, in the form of data, comprising database efforts, data storage, and serving approaches, means that it is distributed across many formats and data types. Data ranges from collections records to molecular characterization, geospatial data to interactions of vectors and traits, infection experiments to field trials. New initiatives arise, often spanning the effort traditionally siloed in specific research disciplines, and other efforts wane, perhaps in response to funding declines, different research directions, or lack of sustained interest. Thusly, the world of vector data – the Vector Data Ecosystem – can become unclear in scope, and the flows of data through these various efforts can become stymied by obsolescence, or simply by gaps in access and interoperability. As increasing attention is paid to creating FAIR (Findable, Accessible, Interoperable, and Reusable) data, simply characterizing what is ‘out there’, and how these existing data aggregation and collection efforts interact, or interoperate with each other, is a useful exercise. This study presents a snapshot of current vector data efforts, reporting on level of accessibility, and commenting on interoperability using an illustration to track a specimen through the data ecosystem to understand where it occurs for the database efforts anticipated to describe it (or parts of its extended specimen data).
  5. Abstract Premise: Plant trait data are essential for quantifying biodiversity and function across Earth, but these data are challenging to acquire for large studies. Diverse strategies are needed, including the liberation of heritage data locked within specialist literature such as floras and taxonomic monographs. Here we report FloraTraiter, a novel approach using rule‐based natural language processing (NLP) to parse computable trait data from biodiversity literature. Methods: FloraTraiter was implemented through collaborative work between programmers and botanical experts and customized for both online floras and scanned literature. We report a strategy spanning optical character recognition, recognition of taxa, iterative building of traits, and establishing linkages among all of these, as well as curational tools and code for turning these results into standard morphological matrices. Results: Over 95% of treatment content was successfully parsed for traits with <1% error. Data for more than 700 taxa are reported, including a demonstration of common downstream uses. Conclusions: We identify strategies, applications, tips, and challenges that we hope will facilitate future similar efforts to produce large open‐source trait data sets for broad community reuse. Largely automated tools like FloraTraiter will be an important addition to the toolkit for assembling trait data at scale.