This content will become publicly available on October 10, 2024

Title: Leveraging the strengths of citizen science and structured surveys to achieve scalable inference on population size

Population size is a key metric for management and policy decisions, yet wildlife monitoring programmes are often limited by the spatial and temporal scope of surveys. In these cases, citizen science data may provide complementary information at higher resolution and greater extent.

We present a case study demonstrating how data from the eBird citizen science programme can be combined with regional monitoring efforts by the US Fish and Wildlife Service to produce high‐resolution estimates of golden eagle abundance. We developed a model that uses aerial survey data from the western United States to calibrate high‐resolution annual estimates of relative abundance from eBird. Using this model, we compared regional population size estimates based on the calibrated eBird information with those based on aerial survey data alone.

Population size estimates based on the calibrated eBird information had strong correspondence to estimates from aerial survey data in two out of four regions, and population trajectories based on the two approaches showed high correlations.

We demonstrate how the combination of citizen science data and targeted surveys can be used to (a) increase the spatial resolution of population size estimates, (b) extend the spatial extent of inference and (c) predict population size beyond the temporal period of surveys. Findings based on this case study can be used to refine policy metrics used by the US Fish and Wildlife Service and inform permitting regulations (e.g. mortality/harm associated with wind energy development).

Policy implications: Our results demonstrate the ability of citizen science data to complement targeted monitoring programmes and improve the efficacy of decision frameworks that require information on population size or trajectory. After validating citizen science data against survey‐based benchmarks, agencies can harness strengths of citizen science data to supplement information needs and increase the resolution and extent of population size predictions.
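
As a rough illustration of the calibration idea described in the abstract, the sketch below rescales citizen science relative abundance with a single factor fitted to survey-based densities, then applies that factor to cells outside the surveyed area. This is not the authors' model: the through-the-origin least-squares fit, the function names, the cell area and all numbers are hypothetical assumptions chosen only to make the idea concrete.

```python
# Minimal sketch (assumption: a single proportional calibration factor;
# the study's actual model is not specified in the abstract).

def fit_calibration(relative_abundance, survey_density):
    """Least-squares slope through the origin: density ~ k * relative abundance."""
    num = sum(r * d for r, d in zip(relative_abundance, survey_density))
    den = sum(r * r for r in relative_abundance)
    if den == 0:
        raise ValueError("relative abundance values are all zero")
    return num / den


def predict_population(k, relative_abundance, cell_area_km2):
    """Sum calibrated densities over equally sized grid cells."""
    return sum(k * r * cell_area_km2 for r in relative_abundance)


# Hypothetical cells covered by both data sources.
surveyed_rel = [0.5, 1.0, 2.0]          # citizen science relative abundance
surveyed_density = [0.01, 0.02, 0.04]   # survey-based birds per km^2

k = fit_calibration(surveyed_rel, surveyed_density)

# Hypothetical cells outside the surveyed area.
unsurveyed_rel = [1.5, 0.5]
estimate = predict_population(k, unsurveyed_rel, cell_area_km2=100.0)
print(f"calibration factor: {k:.4f}, estimated birds: {estimate:.1f}")
```

A single factor is the simplest possible choice; a real analysis would also need to propagate uncertainty from both data sources and check whether the calibration holds across regions.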

Journal Name:
Journal of Applied Ecology
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The research and conservation community has successfully harnessed the wealth of ecological knowledge found in unprecedented volumes of citizen science (CS) data world‐wide. However, few examples exist of the use of CS data to directly inform policy.

    Current examples of applications of CS data mainly stem from programs that are restricted in scope (e.g. defined protocols, restricted sampling time frame), and the potential use of unrestricted CS data to inform policy remains largely untapped.

    Here, we make a call for moving beyond questioning the reliability of CS data and present a case study of how the US Fish and Wildlife Service (USFWS) used information from an unrestricted CS program (eBird) to inform levels of exposure to collision risk for wind energy development.

    Policy implications. The USFWS made the technical recommendation to use eBird abundance estimates for the bald eagle as the only source of information to define low‐risk collision areas as part of the agency's wind energy permitting process. Our study contributes a clear pathway of how to realize the potential of unrestricted CS programs for generating the evidence base needed to inform policy decisions.

  2. Abstract

    Citizen and community science datasets are typically collected using flexible protocols. These protocols enable large volumes of data to be collected globally every year; however, the consequence is that these protocols typically lack the structure necessary to maintain consistent sampling across years. This can result in complex and pronounced interannual changes in the observation process, which can complicate the estimation of population trends because population changes over time are confounded with changes in the observation process.

    Here we describe a novel modelling approach designed to estimate spatially explicit species population trends while controlling for the interannual confounding common in citizen science data. The approach is based on Double machine learning, a statistical framework that uses machine learning (ML) methods to estimate population change and the propensity scores used to adjust for confounding discovered in the data. ML makes it possible to use large sets of features to control for confounding and to model spatial heterogeneity in trends. Additionally, we present a simulation method to identify and adjust for residual confounding missed by the propensity scores.

    To illustrate the approach, we estimated species trends using data from the citizen science project eBird. We used a simulation study to assess the ability of the method to estimate spatially varying trends when faced with realistic confounding and temporal correlation. Results demonstrated the ability to distinguish between spatially constant and spatially varying trends. There were low error rates on the estimated direction of population change (increasing/decreasing) at each location and high correlations on the estimated magnitude of population change.

    The ability to estimate spatially explicit trends while accounting for confounding inherent in citizen science data has the potential to fill important information gaps, helping to estimate population trends for species and/or regions lacking rigorous monitoring data.

  3. Abstract

    Aircraft collisions with birds span the entire history of human aviation, including fatal collisions during some of the first powered human flights. Much effort has been expended to reduce such collisions, but increased knowledge about bird movements and species occurrence could dramatically improve decision support and proactive measures to reduce them. Migratory movements of birds pose a unique, often overlooked, threat to aviation that is particularly difficult for individual airports to monitor and predict, because the occurrence of birds varies extensively in space and time at the local scales of airport responses.

    We use two publicly available datasets, radar data from the US NEXRAD network characterizing migration movements and eBird data collected by citizen scientists, to map bird movements and species composition with low human effort but high temporal and spatial resolution relative to other large‐scale bird survey methods. As a test case, we compare results from weather radar distributions and eBird species composition with detailed bird strike records from three major New York airports.

    We show that weather radar‐based estimates of migration intensity can accurately predict the probability of bird strikes, with 80% of the variation in bird strikes across the year explained by the average amount of migratory movements captured on weather radar. We also show that eBird‐based estimates of species occurrence can, using species’ body mass and flocking propensity, accurately predict when most damaging strikes occur.

    Synthesis and applications. By better understanding when and where different bird species occur, airports across the world can predict seasonal periods of collision risks with greater temporal and spatial resolution; such predictions include potential to predict when the most severe and damaging strikes may occur. Our results highlight the power of federating datasets with bird movement and distribution data for developing better and more taxonomically and ecologically tuned models of likelihood of strikes occurring and severity of strikes.

  4. Primary biodiversity data records that are open access and available in a standardised format are essential for conservation planning and research on policy-relevant time-scales. We created a dataset to document all known occurrence data for the Federally Endangered Poweshiek skipperling butterfly [Oarisma poweshiek (Parker, 1870; Lepidoptera: Hesperiidae)]. The Poweshiek skipperling was a historically common species in prairie systems across the upper Midwest, United States and Manitoba, Canada. Rapid declines have reduced the number of verified extant sites to six. Aggregating and curating Poweshiek skipperling occurrence records documents and preserves all known distributional data, which can be used to address questions related to Poweshiek skipperling conservation, ecology and biogeography. Over 3500 occurrence records were aggregated over a temporal coverage from 1872 to present. Occurrence records were obtained from 37 data providers in the conservation and natural history collection community using both “HumanObservation” and “PreservedSpecimen” as an acceptable basisOfRecord. Data were obtained in different formats and with differing degrees of quality control. During the data aggregation and cleaning process, we transcribed specimen label data, georeferenced occurrences, adopted a controlled vocabulary, removed duplicates and standardised formatting. We examined the dataset for inconsistencies with known Poweshiek skipperling biogeography and phenology and we verified or removed inconsistencies by working with the original data providers. In total, 12 occurrence records were removed because we identified them to be the western congener Oarisma garita (Reakirt, 1866). This resulting dataset enhances the permanency of Poweshiek skipperling occurrence data in a standardised format. This is a validated and comprehensive dataset of occurrence records for the Poweshiek skipperling (Oarisma poweshiek) utilising both observation and specimen-based records.

    Occurrence data are preserved and available for continued research and conservation projects using standardised Darwin Core formatting where possible. Prior to this project, many of these occurrence records were not mobilised and were being stored in individual institutional databases, researcher datasets and personal records. This dataset aggregates presence data from state conservation agencies, natural heritage programmes, natural history collections, citizen scientists, researchers and the U.S. Fish & Wildlife Service. The data include opportunistic observations and collections, research vouchers, observations collected for population monitoring and observations collected using standardised research methodologies. The aggregated occurrence records underwent cleaning efforts that improved data interoperability, removed transcription errors and verified or removed uncertain data. This dataset enhances available information on the spatiotemporal distribution of this Federally Endangered species. As part of this aggregation process, we discovered and verified Poweshiek skipperling occurrence records from two previously unknown states, Nebraska and Ohio.
  5. Abstract

    Monitoring wildlife abundance across space and time is an essential task to study their population dynamics and inform effective management. Acoustic recording units are a promising technology for efficiently monitoring bird populations and communities. While current acoustic data models provide information on the presence/absence of individual species, new approaches are needed to monitor population abundance, ideally across large spatio‐temporal regions.

    We present an integrated modelling framework that combines high‐quality but temporally sparse bird point count survey data with acoustic recordings. Our models account for imperfect detection in both data types and false positive errors in the acoustic data. Using simulations, we compare the accuracy and precision of abundance estimates using differing amounts of acoustic vocalizations obtained from a clustering algorithm, point count data, and a subset of manually validated acoustic vocalizations. We also use our modelling framework in a case study to estimate abundance of the Eastern Wood‐Pewee (Contopus virens) in Vermont, USA.

    The simulation study reveals that combining acoustic and point count data via an integrated model improves accuracy and precision of abundance estimates compared with models informed by either acoustic or point count data alone. Improved estimates are obtained across a wide range of scenarios, with the largest gains occurring when detection probability for the point count data is low. Combining acoustic data with only a small number of point count surveys yields estimates of abundance without the need for validating any of the identified vocalizations from the acoustic data. Within our case study, the integrated models provided moderate support for a decline of the Eastern Wood‐Pewee in this region.

    Our integrated modelling approach combines dense acoustic data with few point count surveys to deliver reliable estimates of species abundance without the need for manual identification of acoustic vocalizations or a prohibitively expensive large number of repeated point count surveys. Our proposed approach offers an efficient monitoring alternative for large spatio‐temporal regions when point count data are difficult to obtain or when monitoring is focused on rare species with low detection probability.
