Title: Integrating automated acoustic vocalization data and point count surveys for estimation of bird abundance
Abstract

Monitoring wildlife abundance across space and time is essential for studying population dynamics and informing effective management. Acoustic recording units are a promising technology for efficiently monitoring bird populations and communities. While current acoustic data models provide information on the presence/absence of individual species, new approaches are needed to monitor population abundance, ideally across large spatio‐temporal regions.

We present an integrated modelling framework that combines high‐quality but temporally sparse bird point count survey data with acoustic recordings. Our models account for imperfect detection in both data types and false positive errors in the acoustic data. Using simulations, we compare the accuracy and precision of abundance estimates using differing amounts of acoustic vocalizations obtained from a clustering algorithm, point count data, and a subset of manually validated acoustic vocalizations. We also use our modelling framework in a case study to estimate abundance of the Eastern Wood‐Pewee (Contopus virens) in Vermont, USA.

The simulation study reveals that combining acoustic and point count data via an integrated model improves accuracy and precision of abundance estimates compared with models informed by either acoustic or point count data alone. Improved estimates are obtained across a wide range of scenarios, with the largest gains occurring when detection probability for the point count data is low. Combining acoustic data with only a small number of point count surveys yields estimates of abundance without the need for validating any of the identified vocalizations from the acoustic data. Within our case study, the integrated models provided moderate support for a decline of the Eastern Wood‐Pewee in this region.

Our integrated modelling approach combines dense acoustic data with few point count surveys to deliver reliable estimates of species abundance without the need for manual identification of acoustic vocalizations or a prohibitively expensive large number of repeated point count surveys. Our proposed approach offers an efficient monitoring alternative for large spatio‐temporal regions when point count data are difficult to obtain or when monitoring is focused on rare species with low detection probability.
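
As a rough illustration of why sharing a single latent abundance across both data types can tighten estimates, the sketch below computes the posterior of abundance N at one site from point counts and acoustic vocalization counts. It is a minimal sketch under stated assumptions, not the authors' model: the Poisson prior on N, the binomial point counts, the Poisson vocalization counts with rate `delta * N + omega` (where `omega` stands in for the false positive rate), and all parameter values and function names are hypothetical and treated as known rather than estimated.

```python
import math


def poisson_pmf(k, rate):
    """Poisson probability mass, computed on the log scale for stability."""
    if rate == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


def binom_pmf(y, n, p):
    """Binomial probability of counting y of n individuals with detection p."""
    if y > n:
        return 0.0
    return math.comb(n, y) * p**y * (1 - p) ** (n - y)


def posterior_abundance(counts, vocal_counts, lam, p, delta, omega, n_max=100):
    """Posterior over N = 0..n_max at one site (all parameters assumed known).

    counts:       repeated point counts, each Binomial(N, p)
    vocal_counts: acoustic detections per recording, each
                  Poisson(delta * N + omega); omega is an assumed
                  false positive rate per recording
    """
    weights = []
    for n in range(n_max + 1):
        w = poisson_pmf(n, lam)  # assumed Poisson(lam) prior on abundance
        for y in counts:
            w *= binom_pmf(y, n, p)
        for v in vocal_counts:
            w *= poisson_pmf(v, delta * n + omega)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def mean_sd(post):
    """Mean and standard deviation of a discrete posterior over 0..len-1."""
    mean = sum(n * w for n, w in enumerate(post))
    var = sum((n - mean) ** 2 * w for n, w in enumerate(post))
    return mean, math.sqrt(var)


# Hypothetical data: one point count and three acoustic recordings.
counts_only = posterior_abundance([2], [], lam=5, p=0.3, delta=4, omega=1)
integrated = posterior_abundance([2], [21, 19, 23], lam=5, p=0.3, delta=4, omega=1)
print(mean_sd(counts_only))
print(mean_sd(integrated))
```

With these made-up inputs, the posterior that uses both data types is narrower than the one that uses the single point count alone, which is the qualitative pattern the simulation study describes.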

 
Award ID(s):
1954406 1916395
NSF-PAR ID:
10450777
Publisher / Repository:
Wiley-Blackwell
Journal Name:
Methods in Ecology and Evolution
Volume:
12
Issue:
6
ISSN:
2041-210X
Page Range / eLocation ID:
p. 1040-1049
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The interface between field biology and technology is energizing the collection of vast quantities of environmental data. Passive acoustic monitoring, the use of unattended recording devices to capture environmental sound, is an example where technological advances have facilitated an influx of data that routinely exceeds the capacity for analysis. Computational advances, particularly the integration of machine learning approaches, will support data extraction efforts. However, the analysis and interpretation of these data will require parallel growth in conceptual and technical approaches for data analysis. Here, we use a large hand‐annotated dataset to showcase analysis approaches that will become increasingly useful as datasets grow and data extraction can be partially automated.

    We propose and demonstrate seven technical approaches for analyzing bioacoustic data: (1) generating species lists and descriptions of vocal variation, (2) assessing how abiotic factors (e.g., rain and wind) impact vocalization rates, (3) testing for differences in community vocalization activity across sites and habitat types, (4) quantifying the phenology of vocal activity, (5) testing for spatiotemporal correlations in vocalizations within species and (6) among species, and (7) using rarefaction analysis to quantify diversity and optimize bioacoustic sampling.

    To demonstrate these approaches, we sampled in 2016 and 2018 and used hand annotations of 129,866 bird vocalizations from two forests in New Hampshire, USA, including sites in the Hubbard Brook Experimental Forest where bioacoustic data could be integrated with more than 50 years of observer‐based avian studies. Acoustic monitoring revealed differences in community patterns in vocalization activity between forests of different ages, as well as between nearby similar watersheds. Of numerous environmental variables that were evaluated, background noise was most clearly related to vocalization rates. The songbird community included one cluster of species where vocalization rates declined as ambient noise increased and another cluster where vocalization rates declined over the nesting season. In some common species, the number of vocalizations produced per day was correlated at scales of up to 15 km. Rarefaction analyses showed that adding sampling sites increased species detections more than adding sampling days.

    Although our analyses used hand‐annotated data, the methods will extend readily to large‐scale automated detection of vocalization events. Such data are likely to become increasingly available as autonomous recording units become more advanced, affordable, and power efficient. Passive acoustic monitoring with human or automated identification at the species level offers growing potential to complement observer‐based studies of avian ecology.
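
The rarefaction step mentioned above can be sketched with the standard sample-based rarefaction formula, which gives the expected number of species detected when m of M sampling units (sites or days) are drawn without replacement. This is a minimal sketch, not the analysis used in the study; the function name, species labels, and incidence counts are hypothetical.

```python
from math import comb


def expected_richness(incidence, n_units, m):
    """Expected species count in a random subset of m sampling units.

    incidence: dict mapping species -> number of units (out of n_units)
               in which that species was detected
    """
    if not 0 <= m <= n_units:
        raise ValueError("m must be between 0 and n_units")
    n_subsets = comb(n_units, m)
    expected = 0.0
    for units_detected in incidence.values():
        # Probability that all m drawn units miss this species.
        p_missed = comb(n_units - units_detected, m) / n_subsets
        expected += 1.0 - p_missed
    return expected


# Hypothetical incidence data across 10 sampling units.
incidence = {"species_a": 10, "species_b": 4, "species_c": 1}
curve = [expected_richness(incidence, 10, m) for m in range(11)]
print(curve)
```

Computing one curve with sites as the sampling units and another with days as the units is one way to compare how quickly each kind of added effort accumulates species.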

     
  2. Abstract

    Environmental and anthropogenic factors affect the population dynamics of migratory species throughout their annual cycles. However, identifying the spatiotemporal drivers of migratory species' abundances is difficult because of extensive gaps in monitoring data. The collection of unstructured opportunistic data by volunteer (citizen science) networks provides a solution to address data gaps for locations and time periods during which structured, design‐based data are difficult or impossible to collect.

    To estimate population abundance and distribution at broad spatiotemporal extents, we developed an integrated model that incorporates unstructured data during time periods and spatial locations when structured data are unavailable. We validated our approach through simulations and then applied the framework to the eastern North American migratory population of monarch butterflies during their spring breeding period in eastern Texas. Spring climate conditions have been identified as a key driver of monarch population sizes during subsequent summer and winter periods. However, low monarch densities during the spring combined with very few design‐based surveys in the region have limited the ability to isolate effects of spring weather variables on monarchs.

    Simulation results confirmed the ability of our integrated model to accurately and precisely estimate abundance indices and the effects of covariates during locations and time periods in which structured sampling is lacking. In our case study, we combined opportunistic monarch observations during the spring migration and breeding period with structured data from the summer Midwestern breeding grounds. Our model revealed a nonstationary relationship between weather conditions and local monarch abundance during the spring, driven by spatially varying vegetation and temperature conditions.

    Data for widespread and migratory species are often fragmented across multiple monitoring programs, potentially requiring the use of both structured and unstructured data sources to obtain complete geographic coverage. Our integrated model can estimate population abundance at broad spatiotemporal extents despite structured data gaps during the annual cycle by leveraging opportunistic data.

     
  3. Abstract

    Acoustic recordings of the environment can produce species presence–absence data for characterizing populations of sound‐producing wildlife over multiple spatial scales. If a species is present at a site but does not vocalize during a scheduled audio recording survey, researchers may incorrectly conclude that the species is absent (“false negative”). The risk of false negatives is compounded when audio devices have sampling constraints, do not record continuously, and must be manually scheduled to operate at pre‐selected times of day, particularly when research programs target multiple species with acoustic availability that varies across temporal conditions.

    We developed a temporally adaptive acoustic sampling algorithm to maximize detection probabilities for a suite of focal species amid sampling constraints. The algorithm combines user‐supplied species vocalization models with site‐specific weather forecasts to set an optimized sampling schedule for the following day. To test our algorithm, we simulated hourly vocalization probabilities for a suite of focal species in a hypothetical monitoring area for the year 2016. We conducted a factorial experiment that sampled from the 2016 acoustic environment to compare the probability of acoustic detection by a fixed (stationary) schedule versus a temporally adaptive optimized schedule under several sampling efforts and monitoring durations.

    We found that over the course of a study season, the probability of acoustically capturing a focal species (given presence) at least once via automated acoustic monitoring was greater (and acoustic capture occurred earlier in the season) when using the temporally adaptive optimized schedule as compared to a fixed schedule.

    The advantages of a temporally adaptive optimized acoustic sampling schedule are magnified when a study duration is short, sampling effort is low, and/or species acoustic availability is minimal. This methodology presents the opportunity to maximize acoustic monitoring sampling efforts amid constraints.
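
The comparison between a fixed and an adaptive schedule can be illustrated with the probability of at least one detection, 1 − Π(1 − p_t), over the recorded hours. The sketch below is a simplified single-species illustration, not the published algorithm: it assumes the species is present, that hours are independent, and that the forecast vocalization probabilities (hypothetical values here) are accurate.

```python
def prob_detected_at_least_once(hourly_probs):
    """Probability of one or more detections across independent recordings."""
    p_missed = 1.0
    for p in hourly_probs:
        p_missed *= 1.0 - p
    return 1.0 - p_missed


def adaptive_schedule(forecast_probs, n_slots):
    """Pick the n_slots hours with the highest forecast vocalization probability."""
    ranked = sorted(
        range(len(forecast_probs)),
        key=lambda hour: forecast_probs[hour],
        reverse=True,
    )
    return sorted(ranked[:n_slots])


# Hypothetical forecast for six candidate hours on the following day.
forecast = [0.05, 0.40, 0.30, 0.10, 0.02, 0.20]
fixed_hours = [0, 1]  # schedule chosen in advance
adaptive_hours = adaptive_schedule(forecast, 2)

p_fixed = prob_detected_at_least_once([forecast[h] for h in fixed_hours])
p_adaptive = prob_detected_at_least_once([forecast[h] for h in adaptive_hours])
print(adaptive_hours, p_fixed, p_adaptive)
```

Extending this to a suite of focal species would require a rule for trading off species against one another, which this sketch deliberately leaves out.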

     
  4. Abstract

    1. The occurrence and distributions of wildlife populations and communities are shifting as a result of global changes. To evaluate whether these shifts are negatively impacting biodiversity processes, it is critical to monitor the status, trends and effects of environmental variables on entire communities. However, modelling the dynamics of multiple species simultaneously can require large amounts of diverse data, and few modelling approaches exist to simultaneously provide species and community‐level inferences.

    2. We present an ‘integrated community occupancy model’ (ICOM) that unites principles of data integration and hierarchical community modelling in a single framework to provide inferences on species‐specific and community occurrence dynamics using multiple data sources. The ICOM combines replicated and nonreplicated detection–nondetection data sources using a hierarchical framework that explicitly accounts for different detection and sampling processes across data sources. We use simulations to compare the ICOM to previously developed hierarchical community occupancy models and single species integrated distribution models. We then apply our model to assess the occurrence and biodiversity dynamics of foliage‐gleaning birds in the White Mountain National Forest in the northeastern USA from 2010 to 2018 using three independent data sources.

    3. Simulations reveal that integrating multiple data sources in the ICOM increased precision and accuracy of species and community‐level inferences compared to single data source models, although benefits of integration were dependent on the information content of individual data sources (e.g. amount of replication). Compared to single species models, the ICOM yielded more precise species‐level estimates. Within our case study, the ICOM had the highest out‐of‐sample predictive performance compared to single species models and models that used only a subset of the three data sources.

    4. The ICOM provides more precise estimates of occurrence dynamics compared to multi‐species models using single data sources or integrated single‐species models. We further found that the ICOM had improved predictive performance across a broad region of interest with an empirical case study of forest birds. The ICOM offers an attractive approach to estimate species and biodiversity dynamics, which is additionally valuable to inform management objectives of both individual species and their broader communities.

     
  5. Abstract

    Integrated population models (IPMs) have become increasingly popular for the modelling of populations, as investigators seek to combine survey and demographic data to understand processes governing population dynamics. These models are particularly useful for identifying and exploring knowledge gaps within life histories, because they allow investigators to estimate biologically meaningful parameters, such as immigration or reproduction, that were previously unidentifiable without additional data. As IPMs have been developed relatively recently, there is much to learn about model behaviour. The behaviour of parameters, such as estimates near boundaries, and the consequences of varying degrees of dependency among datasets have been explored. However, the reliability of parameter estimates remains underexamined, particularly when models include parameters that are not identifiable from one data source, but are indirectly identifiable from multiple datasets and a presumed model structure, such as the estimation of immigration using capture‐recapture, fecundity and count data, combined with a life‐history model.

    To examine the behaviour of model parameter estimates, we simulated stable populations closed to immigration and emigration. We simulated two scenarios that might induce error into survival estimates: marker induced bias in the capture–mark–recapture data and heterogeneity in the mortality process. We subsequently fit capture–mark–recapture, state‐space and fecundity models, as well as IPMs that estimated additional parameters.

    Simulation results suggested that when model assumptions are violated, estimation of additional, previously unidentifiable, parameters using IPMs may be extremely sensitive to these violations. For example, when annual marker loss was simulated, estimates of survival rates were low and estimates of immigration rate from an IPM were high. When heterogeneity in the mortality process was induced, there were substantial relative differences between the medians of posterior distributions and truth for juvenile survival and fecundity.

    Our results have important implications for biological inference when using IPMs, as well as future model development and implementation. Specifically, using multiple datasets to identify additional parameters resulted in the posterior distributions of additional parameters directly reflecting the effects of the violations of model assumptions in integrated modelling frameworks. We suggest that investigators interpret posterior distributions of these parameters as a combination of biological process and systematic error.

     