Title: Considerations for fitting occupancy models to data from eBird and similar volunteer-collected data
Abstract An occupancy model makes use of data that are structured as sets of repeated visits to each of many sites, in order to estimate the actual probability of occupancy (i.e. proportion of occupied sites) after correcting for imperfect detection using the information contained in the sets of repeated observations. We explore the conditions under which preexisting, volunteer-collected data from the citizen science project eBird can be used for fitting occupancy models. Because the majority of eBird’s data are not collected in the form of repeated observations at individual locations, we explore 2 ways in which the single-visit records could be used in occupancy models. First, we assess the potential for space-for-time substitution: aggregating single-visit records from different locations within a region into pseudo-repeat visits. On average, eBird’s observers did not make their observations at locations that were representative of the habitat in the surrounding area, which would lead to biased estimates of occupancy probabilities when using space-for-time substitution. Thus, the use of space-for-time substitution is not always appropriate. Second, we explored the utility of including data from single-visit records to supplement sets of repeated-visit data. In a simulation study we found that inclusion of single-visit records increased the precision of occupancy estimates, but only when detection probabilities were high. When detection probability was low, the addition of single-visit records exacerbated biases in estimates of occupancy probability. We conclude that subsets of data from eBird, and likely from similar projects, can be used for occupancy modeling either using space-for-time substitution or supplementing repeated-visit data with data from single-visit records. The appropriateness of either alternative will depend on the goals of a study and on the probabilities of detection and occupancy of the species of interest.
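To illustrate how repeated visits let an occupancy model correct for imperfect detection, the following is a minimal sketch of a single-season occupancy likelihood with constant occupancy probability (psi) and constant detection probability (p). It is not the authors' code: the function names, the grid-search fitting, and the toy detection histories are all assumptions made for illustration. A single-visit record enters the same likelihood as a detection history of length one.

```python
import math


def site_log_likelihood(history, psi, p):
    """Log-likelihood of one site's detection history (a list of 0/1 values)."""
    detections = sum(history)
    visits = len(history)
    # Probability of this exact history if the site is occupied.
    occupied = psi * p ** detections * (1 - p) ** (visits - detections)
    # An unoccupied site can only produce a history with no detections.
    unoccupied = (1 - psi) if detections == 0 else 0.0
    return math.log(occupied + unoccupied)


def fit_occupancy(histories, steps=99):
    """Grid-search maximum-likelihood estimates of (psi, p) on (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    best = None
    for psi in grid:
        for p in grid:
            ll = sum(site_log_likelihood(h, psi, p) for h in histories)
            if best is None or ll > best[0]:
                best = (ll, psi, p)
    return best[1], best[2]


def naive_occupancy(histories):
    """Proportion of sites with at least one detection (no correction)."""
    return sum(1 for h in histories if any(h)) / len(histories)


# Hypothetical toy data: 20 sites, 3 visits each (not from eBird).
histories = (
    [[1, 1, 0]] * 5    # detected on two of three visits
    + [[1, 0, 0]] * 5  # detected on one of three visits
    + [[0, 0, 0]] * 10 # never detected: unoccupied, or occupied but missed
)

psi_hat, p_hat = fit_occupancy(histories)
print("naive occupancy:", naive_occupancy(histories))
print("estimated psi:", psi_hat, "estimated p:", p_hat)
```

In this toy example the fitted occupancy probability exceeds the naive proportion of sites with detections, because some all-zero histories are attributed to occupied sites where the species was missed on every visit.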
Award ID(s):
1927646
PAR ID:
10443443
Author(s) / Creator(s):
; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Ornithology
Volume:
140
Issue:
4
ISSN:
0004-8038
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Citizen science biodiversity data present great opportunities for ecology and conservation across vast spatial and temporal scales. However, the opportunistic nature of these data lacks the sampling structure required by modeling methodologies that address a pervasive challenge in ecological data collection: imperfect detection, i.e., the likelihood of under-observing species on field surveys. Occupancy modeling is an example of an approach that accounts for imperfect detection by explicitly modeling the observation process separately from the biological process of habitat selection. This produces species distribution models that speak to the pattern of the species on a landscape after accounting for imperfect detection in the data, rather than the pattern of species observations corrupted by errors. To achieve this benefit, occupancy models require multiple surveys of a site across which the site's status (i.e., occupied or not) is assumed constant. Since citizen science data are not collected under the required repeated-visit protocol, observations may be grouped into sites post hoc. Existing approaches for constructing sites discard some observations and/or consider only geographic distance and not environmental similarity. In this study, we compare ten approaches for site construction in terms of their impact on downstream species distribution models for 31 bird species in Oregon, using observations recorded in the eBird database. We find that occupancy models built on sites constructed by spatial clustering algorithms perform better than existing alternatives. 
  2. Long-term monitoring of habitat occupancy can reveal patterns of habitat use, population dynamics, and factors controlling species distribution. The American pika (Ochotona princeps), a small mammal found in rocky habitats throughout western North America, has been targeted for occupancy studies due to its relatively conspicuous behavior and its unusual adaptations for surviving long, cold winters without hibernation. These adaptations include an unusually high resting metabolic rate and maintenance of body temperatures near the lethal maximum for this species, which would appear to compromise the pika's ability to survive warmer summers. Recent monitoring as well as projections based on future climate scenarios have suggested this species is experiencing a period of range retraction due to warming summers and/or loss of insulating winter snow cover. Niwot Ridge is situated ideally to test competing hypotheses about the trajectory and drivers of pika range shift. The pika is still common throughout the Colorado Rockies, but published models differ markedly regarding projections of the pika’s future distribution in this region. Niwot Ridge has experienced warmer summers as well as shorter periods of insulating snow cover in recent years, and there is evidence that pikas are now less common than they once were in at least one area on the ridge. This study is designed to provide robust data on pika population trends through long-term monitoring of occupancy in a spatially balanced random sample of pika habitat patches centered on Niwot Ridge. Survey plots (n = 72) were selected according to a Generalized Random-Tessellation Stratified (GRTS) algorithm, stratified dichotomously by elevation, average annual snow accumulation (SWE), and probabilities of pika occurrence based on previous data. Each plot extends 12 m in radius from a GRTS point. 
To ensure that each plot contains at least 10% cover of talus, plot coordinates were adjusted (usually less than 50 m) or replaced using the GRTS oversample to select the next available and suitable plot within the same categories of elevation, SWE and probability of occurrence (see "pika-survey-GRTS-plot-tracking-record.cr.data.csv" for plot strata, survey schedules, GRTS sequence, and records of plot replacement or location adjustments). Trained technicians survey plots for pikas and fresh pika sign (food caches and fecal pellets) as well as metrics of habitat quality. Each year, 48 of the 72 plots are surveyed in a rotating panel design (24 plots are surveyed annually, 24 in even years and 24 in odd years). Plots are surveyed in August when pikas are engaged in food caching and other conspicuous behaviors related to territory establishment and defense. Data collected at each plot are detailed in a survey manual ("pika_survey.cr.methods.docx"). Each plot is outfitted with a data logger (sensor) to record sub-surface temperature several times each day. Photos of plot and sensor locations are used in navigation and sensor retrieval. Each survey is completed during a brief (half-hour) visit to the plot to service the sensor and to record habitat and pika data. A subset of plots (n = 12) are selected for double surveys each year to allow estimation of pika detection probability. Estimates of detection probability are also informed by data on time to detection of pikas and pika sign recorded during each survey. Samples of fresh pika fecal pellets are collected from occupied plots and are stored as vouchers of pika presence and for use in studies of population genetics and physiology, including studies of physiological stress in relation to habitat quality and microclimate. 
  4. Abstract Effective conservation requires understanding species’ abundance patterns and demographic rates across space and time. Ideally, such knowledge should be available for whole communities because variation in species’ dynamics can elucidate factors leading to biodiversity losses. However, collecting data to simultaneously estimate abundance and demographic rates of communities of species is often prohibitively time intensive and expensive. We developed a multispecies dynamic N‐occupancy model to estimate unbiased, community‐wide relative abundance and demographic rates. In this model, detection–nondetection data (e.g., repeated presence–absence surveys) are used to estimate species‐ and community‐level parameters and the effects of environmental factors. To validate our model, we conducted a simulation study to determine how and when such an approach can be valuable and found that our multispecies model outperformed comparable single‐species models in estimating abundance and demographic rates in many cases. Using data from a network of camera traps across tropical equatorial Africa, we then used our model to evaluate the statuses and trends of a forest‐dwelling antelope community. We estimated relative abundance, rates of recruitment (i.e., reproduction and immigration), and apparent survival probabilities for each species’ local population. The antelope community was fairly stable (although 17% of populations [species–park combinations] declined over the study period). Variation in apparent survival was linked more closely to differences among national parks than to individual species’ life histories. The multispecies dynamic N‐occupancy model requires only detection–nondetection data to evaluate the population dynamics of multiple sympatric species and can thus be a valuable tool for examining the reasons behind recent biodiversity loss. 
  5. Abstract Aircraft collisions with birds span the entire history of human aviation, including fatal collisions during some of the first powered human flights. Much effort has been expended to reduce such collisions, but increased knowledge about bird movements and species occurrence could dramatically improve decision support and proactive measures to reduce them. Migratory movements of birds pose a unique, often overlooked, threat to aviation that is particularly difficult for individual airports to monitor and predict, because the occurrence of birds varies extensively in space and time at the local scales of airport responses. We use two publicly available datasets, radar data from the US NEXRAD network characterizing migration movements and eBird data collected by citizen scientists, to map bird movements and species composition with low human effort expenditures but high temporal and spatial resolution relative to other large‐scale bird survey methods. As a test case, we compare results from weather radar distributions and eBird species composition with detailed bird strike records from three major New York airports. We show that weather radar‐based estimates of migration intensity can accurately predict the probability of bird strikes, with 80% of the variation in bird strikes across the year explained by the average amount of migratory movements captured on weather radar. We also show that eBird‐based estimates of species occurrence can, using species’ body mass and flocking propensity, accurately predict when most damaging strikes occur. Synthesis and applications. By better understanding when and where different bird species occur, airports across the world can predict seasonal periods of collision risks with greater temporal and spatial resolution; such predictions include the potential to predict when the most severe and damaging strikes may occur. Our results highlight the power of federating datasets with bird movement and distribution data for developing better and more taxonomically and ecologically tuned models of likelihood of strikes occurring and severity of strikes. 
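The rotating panel design described in the pika monitoring record above (72 plots, of which 24 are surveyed every year, 24 in even years, and 24 in odd years) can be sketched as follows. This is only an illustration: the function name and the assignment of plot indices to panels are hypothetical, and the actual panel membership is recorded in the dataset's plot-tracking file.

```python
def plots_surveyed(year, n_plots=72):
    """Return the plot indices surveyed in a given year.

    Assumes (hypothetically) that plots are numbered 0..n_plots-1 and split
    into three equal panels in index order: annual, even-year, odd-year.
    """
    panel_size = n_plots // 3
    annual = range(0, panel_size)
    even_years = range(panel_size, 2 * panel_size)
    odd_years = range(2 * panel_size, n_plots)
    rotating = even_years if year % 2 == 0 else odd_years
    return list(annual) + list(rotating)


# Each year covers 48 of the 72 plots; any two consecutive years cover all 72.
for year in (2020, 2021):
    print(year, len(plots_surveyed(year)))
```

Under these assumptions every plot is visited at least once in any two consecutive years, while the 24 annual plots provide an unbroken yearly series.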