

Title: Occupancy–detection models with museum specimen data: Promise and pitfalls
Abstract

Historical museum records provide potentially useful data for identifying drivers of change in species occupancy. However, because museum records are typically obtained via many collection methods, methodological developments are needed to enable robust inferences. Occupancy–detection models, a relatively new and powerful suite of statistical methods, are a potentially promising avenue because they can account for changes in collection effort through space and time.

We use simulated datasets to identify how and when patterns in data and/or modelling decisions can bias inference. We focus primarily on the consequences of contrasting methodological approaches for dealing with species' ranges and inferring species' non‐detections in both space and time.

We find that not all datasets are suitable for occupancy–detection analysis but, under the right conditions (namely, datasets that are broken into more time periods for occupancy inference and that contain a high fraction of community‐wide collections, or collection events that focus on communities of organisms), models can accurately estimate trends. Finally, we present a case study on eastern North American odonates where we calculate long‐term trends of occupancy using our most robust workflow.

These results indicate that occupancy–detection models are a suitable framework for some research cases and expand the suite of tools available to researchers for macroecological analysis, especially where structured datasets are unavailable.

 
PAR ID:
10396009
Publisher / Repository:
Wiley-Blackwell
Journal Name:
Methods in Ecology and Evolution
Volume:
14
Issue:
2
ISSN:
2041-210X
Size(s):
p. 402-414
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Dated, geo‐referenced museum specimens are a rich data source for reconstructing species' distribution and abundance patterns. However, museum records are potentially biased towards over‐representation of rare species, and it is unclear whether museum records can be used to estimate relative abundance in the field.

    We assembled 17 coupled field and museum datasets to quantitatively compare relative abundance estimates with the Dirichlet distribution. Collectively, these datasets comprise 73,039 museum records and 1,405,316 field observations of 2,240 species.

    Although museum records of rare species overestimated relative abundance by 1‐fold to over 100‐fold (median study = 9.0), the relative abundance of species estimated from museum occurrence records was strongly correlated with relative abundance estimated from standardized field surveys (r² range of 0.10–0.91, median study = 0.43).

    These analyses provide a justification for estimating species relative abundance with carefully curated museum occurrence records, which may allow for the detection of temporal or spatial shifts in the rank ordering of common and rare species.

     
  2. Abstract

    Occupancy modelling is a common approach to assess species distribution patterns, while explicitly accounting for false absences in detection–nondetection data. Numerous extensions of the basic single‐species occupancy model exist to model multiple species, spatial autocorrelation and to integrate multiple data types. However, development of specialized and computationally efficient software to incorporate such extensions, especially for large datasets, is scarce or absent.

    We introduce the spOccupancy R package designed to fit single‐species and multi‐species spatially explicit occupancy models. We fit all models within a Bayesian framework using Pólya‐Gamma data augmentation, which results in fast and efficient inference. spOccupancy provides functionality for data integration of multiple single‐species detection–nondetection datasets via a joint likelihood framework. The package leverages Nearest Neighbour Gaussian Processes to account for spatial autocorrelation, which enables spatially explicit occupancy modelling for potentially massive datasets (e.g. 1,000s–100,000s of sites).

    spOccupancy provides user‐friendly functions for data simulation, model fitting, model validation (by posterior predictive checks), model comparison (using information criteria and k‐fold cross‐validation) and out‐of‐sample prediction. We illustrate the package's functionality via a vignette, simulated data analysis and two bird case studies.

    The spOccupancy package provides a user‐friendly platform to fit a variety of single and multi‐species occupancy models, making it straightforward to address detection biases and spatial autocorrelation in species distribution models even for large datasets.

     
  3. Abstract

    Natural history collections (NHC) provide a wealth of information that can be used to understand the impacts of global change on biodiversity. As such, there is growing interest in using NHC data to estimate changes in species' distributions and abundance trends over historic time horizons when contemporary survey data are limited or unavailable.

    However, museum specimens were not collected with the purpose of estimating population trends and thus can exhibit spatiotemporal and collector‐specific biases that can impose severe limitations to using NHC data for evaluating population trajectories.

    Here we review the challenges associated with using museum records to track long‐term insect population trends, including spatiotemporal biases in sampling effort and sparse temporal coverage within and across years. We highlight recent methodological advancements that aim to overcome these challenges and discuss emerging research opportunities.

    Specifically, we examine the potential of integrating museum records and other contemporary data sources (e.g. collected via structured, designed surveys and opportunistic citizen science programs) in a unified analytical framework that accounts for the sampling biases associated with each data source. The emerging field of integrated modelling provides a promising framework for leveraging the wealth of collections data to accurately estimate long‐term trends of insect populations and identify cases where that is not possible using existing data sources.

     
  4. Abstract

    Landscape‐scale bioacoustic projects have become a popular approach to biodiversity monitoring. Combining passive acoustic monitoring recordings and automated detection provides an effective means of monitoring sound‐producing species' occupancy and phenology and can lend insight into unobserved behaviours and patterns. The availability of low‐cost recording hardware has lowered barriers to large‐scale data collection, but technological barriers in data analysis remain a bottleneck for extracting biological insight from bioacoustic datasets.

    We provide a robust and open‐source Python toolkit for detecting and localizing biological sounds in acoustic data.

    OpenSoundscape provides access to automated acoustic detection, classification and localization methods through a simple and easy‐to‐use set of tools. Extensive documentation and tutorials provide step‐by‐step instructions and examples of end‐to‐end analysis of bioacoustic data. Here, we describe the functionality of this package and provide concise examples of bioacoustic analyses with OpenSoundscape.

    By providing an interface for bioacoustic data and methods, we hope this package will lead to increased adoption of bioacoustics methods and ultimately to enhanced insights for ecology and conservation.

     
  5. Abstract

    Understanding patterns of diversity is central to ecology and conservation, yet estimates of diversity are often biased by imperfect detection. In recent years, multi‐species occupancy models (MSOM) have been developed as a statistical tool to account for species‐specific heterogeneity in detection while estimating true measures of diversity. Although the power of these models has been tested in various ways, their ability to estimate gamma diversity (or true community size, N) is a largely unrecognized feature that needs rigorous evaluation.

    We use both simulations and an empirical dataset to evaluate the bias, precision, accuracy and coverage of estimates of N from MSOM compared to the widely applied iChao2 non‐parametric estimator. We simulated 5,600 datasets across seven scenarios of varying average occupancy and detectability covariates, as well as varying numbers of sites, replicates and true community size. Additionally, we use a real dataset of surveys over 9 years (where species accumulation reached an asymptote, indicating true N), to estimate N from each annual survey.

    Simulations showed that both MSOM and iChao2 estimators are generally accurate (i.e. unbiased and precise) except under unideal scenarios where mean species occupancy is low. In such scenarios, MSOM frequently overestimated N. Across all scenarios, MSOM estimates were less certain than iChao2, but this led to over‐confident iChao2 estimates that showed poor coverage. Results from the real dataset largely confirmed the simulation findings, with MSOM estimates showing greater accuracy and coverage than iChao2.

    Community ecologists have a wide choice of analytical methods, and both iChao2 and MSOM estimates of N are substantially preferable to raw species counts. The simplicity of non‐parametric estimators has obvious advantages, but our results show that in many cases, MSOM may provide superior estimates that also account more accurately for uncertainty. Both methods can show strong bias when average occupancy is very low, and practitioners should show caution when using estimates derived from either method under such conditions.

     