
Title: Species traits and observer behaviors that bias data assimilation and how to accommodate them

Datasets that monitor biodiversity capture information differently depending on their design, which influences observer behavior and can lead to biases across observations and species. Combining different datasets can improve our ability to identify and understand threats to biodiversity, but this requires an understanding of the observation bias in each. Two datasets widely used to monitor bird populations exemplify these general concerns: eBird is a citizen science project with high spatiotemporal resolution but variation in distribution, effort, and observers, whereas the Breeding Bird Survey (BBS) is a structured survey of specific locations over time. Analyses using these two datasets can identify contradictory population trends. To understand these discrepancies and facilitate data fusion, we quantify species‐level reporting differences across eBird and the BBS in three regions across the United States by jointly modeling bird abundances using data from both datasets. First, we fit a joint Species Distribution Model that accounts for environmental conditions and effort to identify reporting differences across the datasets. We then examine how these differences in reporting are related to species traits. Finally, we analyze species reported to one dataset but not the other and determine whether traits differ between reported and unreported species. We find that most species are reported more in the BBS than eBird. Specifically, we find that compared to eBird, BBS observers tend to report higher counts of common species and species that are usually detected by sound. We also find that species associated with water are reported less in the BBS. Species typically identified by sound are reported more at sunrise than later in the morning. Our results quantify reporting differences in eBird and the BBS to enhance our understanding of how each captures information and how they should be used. 
The reporting rates we identify can also be incorporated into observation models through detectability or effort to improve analyses across species and datasets. The method demonstrated here can be used to compare reporting rates across any two or more datasets to examine biases.
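As a rough illustration of what a species-level reporting difference between two datasets means, the sketch below compares effort-standardized counts. It is not the joint Species Distribution Model described in the abstract, which also accounts for environmental conditions. The function name, the pseudocount, and the toy totals are all hypothetical.

```python
from math import log

def log_reporting_ratio(count_a, effort_a, count_b, effort_b, pseudo=0.5):
    """Log ratio of effort-standardized counts, dataset A relative to dataset B.

    A positive value means the species is reported more per unit effort in A.
    The pseudocount (an assumption of this sketch) avoids log(0) when a
    species is missing from one dataset.
    """
    rate_a = (count_a + pseudo) / effort_a
    rate_b = (count_b + pseudo) / effort_b
    return log(rate_a / rate_b)

# Hypothetical totals per species:
# (BBS count, BBS effort hours, eBird count, eBird effort hours)
toy = {
    "species_heard": (120, 50.0, 150, 100.0),
    "species_water": (10, 50.0, 60, 100.0),
}

# Positive: reported more in the BBS; negative: reported more in eBird.
ratios = {sp: log_reporting_ratio(*vals) for sp, vals in toy.items()}

for sp, r in ratios.items():
    print(f"{sp}: log reporting ratio (BBS vs eBird) = {r:+.2f}")
```

A ratio of this kind could in principle enter an observation model as an offset on effort or detectability, but how best to do that depends on the model and is left open here.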

Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
Journal Name: Ecological Applications
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Silva, Daniel de (Ed.)
    Biodiversity loss is a global ecological crisis that is both a driver of and response to environmental change. Understanding the connections between species declines and other components of human-natural systems extends across the physical, life, and social sciences. From an analysis perspective, this requires integration of data from different scientific domains, which often have heterogeneous scales and resolutions. Community science projects such as eBird may help to fill spatiotemporal gaps and enhance the resolution of standardized biological surveys. Comparisons between eBird and the more comprehensive North American Breeding Bird Survey (BBS) have found these datasets can produce consistent multi-year abundance trends for bird populations at national and regional scales. Here we investigate the reliability of these datasets for estimating patterns at finer resolutions, inter-annual changes in abundance within town boundaries. Using a case study of 14 focal species within Massachusetts, we calculated four indices of annual relative abundance using eBird and BBS datasets, including two different modeling approaches within each dataset. We compared the correspondence between these indices in terms of multi-year trends, annual estimates, and inter-annual changes in estimates at the state and town-level. We found correspondence between eBird and BBS multi-year trends, but this was not consistent across all species and diminished at finer, inter-annual temporal resolutions. We further show that standardizing modeling approaches can increase index reliability even between datasets at coarser temporal resolutions. Our results indicate that multiple datasets and modeling methods should be considered when estimating species population dynamics at finer temporal resolutions, but standardizing modeling approaches may improve estimate correspondence between abundance datasets. 
In addition, reliability of these indices at finer spatial scales may depend on habitat composition, which can impact survey accuracy. 
  2.
    The growth of biodiversity data sets generated by citizen scientists continues to accelerate. The availability of such data has greatly expanded the scale of questions researchers can address. Yet, error, bias, and noise continue to be serious concerns for analysts, particularly when data being contributed to these giant online data sets are difficult to verify. Counts of birds contributed to eBird, the world’s largest biodiversity online database, present a potentially useful resource for tracking trends over time and space in species’ abundances. We quantified counting accuracy in a sample of 1,406 eBird checklists by comparing numbers contributed by birders (N = 246) who visited a popular birding location in Oregon, USA, with numbers generated by a professional ornithologist engaged in a long-term study creating benchmark (reference) measurements of daily bird counts. We focused on waterbirds, which are easily visible at this site. We evaluated potential predictors of count differences, including characteristics of contributed checklists, of each species, and of time of day and year. Count differences were biased toward undercounts, with more than 75% of counts being below the daily benchmark value. Median count discrepancies were −29.1% (range: 0 to −42.8%; N = 20 species). Model sets revealed an important influence of each species’ reference count, which varied seasonally as waterbird numbers fluctuated, and of percent of species known to be present each day that were included on each checklist. That is, checklists indicating a more thorough survey of the species richness at the site also had, on average, smaller count differences. However, even on checklists with the most thorough species lists, counts were biased low and exceptionally variable in their accuracy. To improve utility of such bird count data, we suggest three strategies to pursue in the future. 
(1) Assess additional options for analytically determining how to select checklists that include less biased count data, as well as exploring options for correcting bias during the analysis stage. (2) Add options for users to provide additional information that helps analysts choose checklists, such as an option for users to tag checklists where they focused on obtaining accurate counts. (3) Explore opportunities to effectively calibrate citizen-science bird count data by establishing a formalized network of marquee sites where dedicated observers regularly contribute carefully collected benchmark data.
  3. Abstract

    Aircraft collisions with birds span the entire history of human aviation, including fatal collisions during some of the first powered human flights. Much effort has been expended to reduce such collisions, but increased knowledge about bird movements and species occurrence could dramatically improve decision support and proactive measures to reduce them. Migratory movements of birds pose a unique, often overlooked threat to aviation that is particularly difficult for individual airports to monitor and predict, because the occurrence of birds varies extensively in space and time at the local scales of airport responses.

    We use two publicly available datasets, radar data from the US NEXRAD network characterizing migration movements and eBird data collected by citizen scientists, to map bird movements and species composition with low human effort expenditures but high temporal and spatial resolution relative to other large‐scale bird survey methods. As a test case, we compare results from weather radar distributions and eBird species composition with detailed bird strike records from three major New York airports.

    We show that weather radar‐based estimates of migration intensity can accurately predict the probability of bird strikes, with 80% of the variation in bird strikes across the year explained by the average amount of migratory movements captured on weather radar. We also show that eBird‐based estimates of species occurrence can, using species’ body mass and flocking propensity, accurately predict when most damaging strikes occur.

    Synthesis and applications. By better understanding when and where different bird species occur, airports across the world can predict seasonal periods of collision risks with greater temporal and spatial resolution; such predictions include potential to predict when the most severe and damaging strikes may occur. Our results highlight the power of federating datasets with bird movement and distribution data for developing better and more taxonomically and ecologically tuned models of likelihood of strikes occurring and severity of strikes.

  4. Abstract

    Aim

    Understanding and addressing the global biodiversity crisis requires ecological information compiled continuously from across the globe. Data from citizen science initiatives are useful for quantifying species' ecological niches and geographical distributions but can be difficult to apply towards biodiversity monitoring. The presence of fixed geographical locations reduces the opportunistic nature of citizen science data, allowing for more reliable and nuanced trend estimation. The eBird citizen‐science program contains predefined locations whose bird assemblages are sampled across years (‘hotspots’). For hotspots to function as a biodiversity monitoring resource, issues related to data coverage, biases, and trends need to be addressed.




    We estimated the survey completeness of species richness at 300,500 eBird hotspots during 2002–2022. We documented sampling biases at eBird hotspot and non‐hotspot locations during 2022 based on protection status, temperature, precipitation, and landcover.


    A total of 10,410 bird species (ca. 96.9% of total) were recorded at hotspots. The number of hotspots, checklists, and participants and the quality of species richness estimates increased worldwide with the Nearctic containing the strongest and most consistent trends. Compared to non‐hotspots, hotspots oversampled areas with higher protection status. Hotspots and non‐hotspots oversampled warmer and wetter locations in the Antarctic, Nearctic, and Palearctic, and cooler locations in the Afrotropics, Australasia, and the Neotropics. Hotspots and especially non‐hotspots oversampled urban areas. Hotspots and non‐hotspots undersampled shrublands in Australasia. Hotspots and especially non‐hotspots undersampled forests in the Afrotropics, Indomalaya, Neotropics, and Oceania.

    Main Conclusions

    Hotspots have captured a large component of the world's avian diversity but have done so inconsistently across space and time. Data quantity and quality are increasing in many regions, but the presence of regionally specific sampling biases and spatial uncertainty in hotspot locations should be addressed when applying the data.

  5. The conversion of forest to agriculture is considered one of the greatest threats to avian biodiversity, yet how species respond to habitat modification throughout the annual cycle remains unknown. We examined whether forest bird associations with agricultural habitats vary throughout the year, and if species traits influence these relationships. Using data from the eBird community‐science program, we investigated associations between agriculturally‐modified land cover and the occurrence of 238 forest bird species based on three sets of avian traits: migratory strategy, dietary guild, and foraging strategy. We found that the influence of agriculturally‐modified land cover on species distributions varied widely across periods and trait groups, but we highlight several broad findings. First, migratory species showed strong seasonal differences in their response to agricultural land cover while resident species did not. Second, there was a migratory strategy by season interaction; Neotropical migrants were most negatively influenced by agricultural land cover during the breeding period while short‐distance migrants were most negatively influenced during the non‐breeding period. Third, regardless of season, some dietary (e.g. insectivores) and foraging guilds (e.g. bark foragers) consistently responded more negatively to agricultural land cover than others (e.g. omnivores and ground foragers, respectively). Fourth, there were greater differences among dietary guilds in their responses to agricultural land cover during the breeding period than during the non‐breeding period, perhaps reflecting how different habitat and ecological requirements enhance the susceptibility of some guilds during reproduction. These results suggest that management efforts across the annual cycle may be oversimplified and thus ineffective when based on broad ecological generalisations that are static in space and time.
