Title: A penalized likelihood for multispecies occupancy models improves predictions of species interactions

Multispecies occupancy models estimate dependence among multiple species of interest from patterns of co‐occurrence, but problems associated with separation and boundary estimates can lead to unreasonably large estimates of parameters and associated standard errors when species are rarely observed at the same site or when data are sparse. In this paper, we overcome these issues by implementing a penalized likelihood, which introduces a small bias in parameter estimates in exchange for a potentially large reduction in variance. We compare parameter estimates obtained from both penalized and unpenalized multispecies occupancy models fit to simulated data that exhibit various degrees of separation and to a real‐world data set of bird surveys with little apparent overlap between potentially interacting species. Our simulation results demonstrate that penalized multispecies occupancy models did not exhibit boundary estimates and produced lower bias, lower mean squared error, and improved inference relative to unpenalized models. When applied to real‐world data, our penalized multispecies occupancy model constrained boundary estimates and allowed for meaningful inference related to the interactions of two species of conservation concern. To facilitate the use of our penalized multispecies occupancy model, the techniques demonstrated in this paper have been integrated into the unmarked package in the R programming language.
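The separation problem described above can be illustrated with a deliberately simplified sketch. It is not the authors' model or the unmarked implementation: it assumes perfect detection, two species, no covariates, and a ridge-type penalty (penalty/2 times the sum of squared parameters), all of which are assumptions made only for illustration. The function names, site counts, and penalty value are hypothetical. When the two species are never seen at the same site, the unpenalized interaction parameter f12 keeps drifting toward negative infinity, whereas the penalized estimate settles at a finite value.

```python
import math


def cell_probs(f1, f2, f12):
    """Probabilities of the four occupancy states (00, 10, 01, 11)
    from natural parameters f1, f2 and the interaction term f12."""
    weights = [1.0, math.exp(f1), math.exp(f2), math.exp(f1 + f2 + f12)]
    total = sum(weights)
    return [w / total for w in weights]


def fit(counts, penalty=0.0, n_iter=2000, step=0.5):
    """Gradient ascent on (loglik - penalty/2 * sum(theta^2)) / N.

    counts = (n00, n10, n01, n11): number of sites in each state.
    Assumes perfect detection, so this is only a toy version of a
    multispecies occupancy likelihood.
    """
    n00, n10, n01, n11 = counts
    n = sum(counts)
    f1 = f2 = f12 = 0.0
    for _ in range(n_iter):
        p00, p10, p01, p11 = cell_probs(f1, f2, f12)
        # Observed minus expected sufficient statistics, minus the penalty gradient.
        g1 = (n10 + n11) / n - (p10 + p11) - penalty * f1 / n
        g2 = (n01 + n11) / n - (p01 + p11) - penalty * f2 / n
        g12 = n11 / n - p11 - penalty * f12 / n
        f1 += step * g1
        f2 += step * g2
        f12 += step * g12
    return f1, f2, f12


# Hypothetical data: the two species never co-occur (n11 = 0).
counts = (20, 15, 15, 0)

unpen_short = fit(counts, penalty=0.0, n_iter=2000)
unpen_long = fit(counts, penalty=0.0, n_iter=20000)
pen_short = fit(counts, penalty=1.0, n_iter=2000)
pen_long = fit(counts, penalty=1.0, n_iter=20000)

print("unpenalized f12 after 2000 / 20000 steps:", unpen_short[2], unpen_long[2])
print("penalized   f12 after 2000 / 20000 steps:", pen_short[2], pen_long[2])
```

Running longer makes the unpenalized f12 more negative without limit, which is the boundary-estimate behaviour the abstract refers to, while the penalized f12 no longer changes once the optimizer has converged. The price is that the penalized estimate is shrunk toward zero, which is the small bias traded for reduced variance.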

Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Understanding patterns of diversity is central to ecology and conservation, yet estimates of diversity are often biased by imperfect detection. In recent years, multi‐species occupancy models (MSOM) have been developed as a statistical tool to account for species‐specific heterogeneity in detection while estimating true measures of diversity. Although the power of these models has been tested in various ways, their ability to estimate gamma diversity, or true community size N, is a largely unrecognized feature that needs rigorous evaluation.

    We use both simulations and an empirical dataset to evaluate the bias, precision, accuracy and coverage of estimates of N from MSOM compared to the widely applied iChao2 non‐parametric estimator. We simulated 5,600 datasets across seven scenarios of varying average occupancy and detectability covariates, as well as varying numbers of sites, replicates and true community size. Additionally, we use a real dataset of surveys over 9 years (where species accumulation reached an asymptote, indicating true N), to estimate N from each annual survey.

    Simulations showed that both MSOM and iChao2 estimators are generally accurate (i.e. unbiased and precise) except under unideal scenarios where mean species occupancy is low. In such scenarios, MSOM frequently overestimated N. Across all scenarios, MSOM estimates were less certain than iChao2, but this led to over‐confident iChao2 estimates that showed poor coverage. Results from the real dataset largely confirmed the simulation findings, with MSOM estimates showing greater accuracy and coverage than iChao2.

    Community ecologists have a wide choice of analytical methods, and both iChao2 and MSOM estimates of N are substantially preferable to raw species counts. The simplicity of non‐parametric estimators has obvious advantages, but our results show that in many cases, MSOM may provide superior estimates that also account more accurately for uncertainty. Both methods can show strong bias when average occupancy is very low, and practitioners should show caution when using estimates derived from either method under such conditions.

  2. Abstract

    Historical museum records provide potentially useful data for identifying drivers of change in species occupancy. However, because museum records are typically obtained via many collection methods, methodological developments are needed to enable robust inferences. Occupancy–detection models, a relatively new and powerful suite of statistical methods, are a potentially promising avenue because they can account for changes in collection effort through space and time.

    We use simulated datasets to identify how and when patterns in data and/or modelling decisions can bias inference. We focus primarily on the consequences of contrasting methodological approaches for dealing with species' ranges and inferring species' non‐detections in both space and time.

    We find that not all datasets are suitable for occupancy–detection analysis but, under the right conditions (namely, datasets that are broken into more time periods for occupancy inference and that contain a high fraction of community‐wide collections, or collection events that focus on communities of organisms), models can accurately estimate trends. Finally, we present a case study on eastern North American odonates where we calculate long‐term trends of occupancy using our most robust workflow.

    These results indicate that occupancy–detection models are a suitable framework for some research cases and expand the suite of available tools for macroecological analysis available to researchers, especially where structured datasets are unavailable.

  3. Abstract

    Effective conservation requires understanding species’ abundance patterns and demographic rates across space and time. Ideally, such knowledge should be available for whole communities because variation in species’ dynamics can elucidate factors leading to biodiversity losses. However, collecting data to simultaneously estimate abundance and demographic rates of communities of species is often prohibitively time intensive and expensive. We developed a multispecies dynamic N‐occupancy model to estimate unbiased, community‐wide relative abundance and demographic rates. In this model, detection–nondetection data (e.g., repeated presence–absence surveys) are used to estimate species‐ and community‐level parameters and the effects of environmental factors. To validate our model, we conducted a simulation study to determine how and when such an approach can be valuable and found that our multispecies model outperformed comparable single‐species models in estimating abundance and demographic rates in many cases. Using data from a network of camera traps across tropical equatorial Africa, we then used our model to evaluate the statuses and trends of a forest‐dwelling antelope community. We estimated relative abundance, rates of recruitment (i.e., reproduction and immigration), and apparent survival probabilities for each species’ local population. The antelope community was fairly stable (although 17% of populations [species–park combinations] declined over the study period). Variation in apparent survival was linked more closely to differences among national parks than to individual species’ life histories. The multispecies dynamic N‐occupancy model requires only detection–nondetection data to evaluate the population dynamics of multiple sympatric species and can thus be a valuable tool for examining the reasons behind recent biodiversity loss.

  4. Abstract

    Site occupancy models (SOMs) are a common tool for studying the spatial ecology of wildlife. When observational data are collected using passive monitoring field methods, including camera traps or autonomous recorders, detections of animals may be temporally autocorrelated, leading to biased estimates and incorrectly quantified uncertainty. We presently lack clear guidance for understanding and mitigating the consequences of temporal autocorrelation when estimating occupancy models with camera trap data.

    We use simulations to explore when and how autocorrelation gives rise to biased or overconfident estimates of occupancy. We explore the impact of sampling design and biological conditions on model performance in the presence of autocorrelation, investigate the usefulness of several techniques for identifying and mitigating bias and compare performance of the SOM to a model that explicitly estimates autocorrelation. We also conduct a case study using detections of 22 North American mammals.

    We show that a join count goodness‐of‐fit test previously proposed for identifying clustered detections is effective for detecting autocorrelation across a range of conditions. We find that strong bias occurs in the estimated occupancy intercept when survey durations are short and detection rates are low. We provide a reference table for assessing the degree of bias to be expected under all conditions. We further find that discretizing data with larger windows decreases the magnitude of bias introduced by autocorrelation. In our case study, we find that detections of most species are autocorrelated and demonstrate how larger detection windows might mitigate the resulting bias.

    Our findings suggest that autocorrelation is likely widespread in camera trap data and that many previous studies of occupancy based on camera trap data may have systematically underestimated occupancy probabilities. Moving forward, we recommend that ecologists estimating occupancy from camera trap data use the join count goodness‐of‐fit test to determine whether autocorrelation is present in their data. If it is, SOMs should use large detection windows to mitigate bias and more accurately quantify uncertainty in occupancy model parameters. Ecologists should not use gaps between detection periods, which are ineffective at mitigating temporal structure in data and discard useful data.

  5. Abstract

    Merging robust statistical methods with complex simulation models is a frontier for improving ecological inference and forecasting. However, bringing these tools together is not always straightforward. Matching data with model output, determining starting conditions, and addressing high dimensionality are some of the complexities that arise when attempting to incorporate ecological field data with mechanistic models directly using sophisticated statistical methods. To illustrate these complexities and pragmatic paths forward, we present an analysis using tree‐ring basal area reconstructions in Denali National Park (DNPP) to constrain successional trajectories of two spruce species (Picea mariana and Picea glauca) simulated by a forest gap model, University of Virginia Forest Model Enhanced (UVAFME). Through this process, we provide preliminary ecological inference about the long‐term competitive dynamics between slow‐growing P. mariana and relatively faster‐growing P. glauca. Incorporating tree‐ring data into UVAFME allowed us to estimate a bias correction for stand age with improved parameter estimates. We found that higher parameter values for P. mariana minimum growth under stress and P. glauca maximum growth rate were key to improving simulations of coexistence, agreeing with recent research that faster‐growing P. glauca may outcompete P. mariana under climate change scenarios. The implementation challenges we highlight are a crucial part of the conversation for how to bring models together with data to improve ecological inference and forecasting.
