

Title: Integrated community occupancy models: A framework to assess occurrence and biodiversity dynamics using multiple data sources
Abstract

1. The occurrence and distributions of wildlife populations and communities are shifting as a result of global changes. To evaluate whether these shifts are negatively impacting biodiversity processes, it is critical to monitor the status, trends and effects of environmental variables on entire communities. However, modelling the dynamics of multiple species simultaneously can require large amounts of diverse data, and few modelling approaches exist to simultaneously provide species and community‐level inferences.

2. We present an ‘integrated community occupancy model’ (ICOM) that unites principles of data integration and hierarchical community modelling in a single framework to provide inferences on species‐specific and community occurrence dynamics using multiple data sources. The ICOM combines replicated and nonreplicated detection–nondetection data sources using a hierarchical framework that explicitly accounts for different detection and sampling processes across data sources. We use simulations to compare the ICOM to previously developed hierarchical community occupancy models and single species integrated distribution models. We then apply our model to assess the occurrence and biodiversity dynamics of foliage‐gleaning birds in the White Mountain National Forest in the northeastern USA from 2010 to 2018 using three independent data sources.

3. Simulations reveal that integrating multiple data sources in the ICOM increased precision and accuracy of species and community‐level inferences compared to single data source models, although benefits of integration were dependent on the information content of individual data sources (e.g. amount of replication). Compared to single species models, the ICOM yielded more precise species‐level estimates. Within our case study, the ICOM had the highest out‐of‐sample predictive performance compared to single species models and models that used only a subset of the three data sources.

4. The ICOM provides more precise estimates of occurrence dynamics compared to multi‐species models using single data sources or integrated single‐species models. We further found that the ICOM had improved predictive performance across a broad region of interest with an empirical case study of forest birds. The ICOM offers an attractive approach to estimate species and biodiversity dynamics, which is additionally valuable to inform management objectives of both individual species and their broader communities.
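The integration idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single season, two hypothetical data sources (a replicated source A and a nonreplicated source B), constant detection probabilities, and illustrative parameter values. All function and variable names are invented for this example. Species-level occupancy probabilities are drawn from a shared community-level distribution, and both data sources are generated from the same latent occupancy state, which is what allows them to be combined in one likelihood.

```python
import math
import random


def inv_logit(x):
    """Map a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_icom(n_species=5, n_sites=40, n_visits=3, seed=1):
    """Simulate one season of data for a hypothetical two-source community.

    Source A: replicated surveys (n_visits per site) at the first half of sites.
    Source B: a single, nonreplicated survey at every site.
    All parameter values below are illustrative assumptions.
    """
    rng = random.Random(seed)
    mu_psi, sd_psi = 0.0, 1.0  # community-level occupancy mean and SD (logit scale)
    p_a, p_b = 0.5, 0.3        # source-specific detection probabilities
    data = []
    for _ in range(n_species):
        # Species-level occupancy drawn from the community distribution.
        psi = inv_logit(rng.gauss(mu_psi, sd_psi))
        # Latent occupancy state shared by both data sources.
        z = [int(rng.random() < psi) for _ in range(n_sites)]
        y_a = [[int(z[j] and rng.random() < p_a) for _ in range(n_visits)]
               for j in range(n_sites // 2)]
        y_b = [int(z[j] and rng.random() < p_b) for j in range(n_sites)]
        data.append({"psi": psi, "z": z, "y_a": y_a, "y_b": y_b})
    return data


def site_log_lik(psi, histories):
    """Marginal log-likelihood of one species at one site.

    histories: list of (detections, p) pairs, one pair per data source
    that surveyed the site. The latent state is summed out.
    """
    occupied = psi
    any_detection = False
    for detections, p in histories:
        for y in detections:
            occupied *= p if y else (1.0 - p)
            any_detection = any_detection or bool(y)
    # An unoccupied site can only produce an all-zero history.
    unoccupied = 0.0 if any_detection else (1.0 - psi)
    return math.log(occupied + unoccupied)
```

For a site surveyed by both sources, passing `[(y_a_site, 0.5), ([y_b_site], 0.3)]` to `site_log_lik` yields the joint contribution. A site surveyed only by the nonreplicated source contributes a single-visit term, which on its own carries little information to separate occupancy from detection.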

 
Award ID(s):
1954406
PAR ID:
10411014
Publisher / Repository:
Wiley-Blackwell
Journal Name:
Methods in Ecology and Evolution
Volume:
13
Issue:
4
ISSN:
2041-210X
Page Range / eLocation ID:
p. 919-932
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Data deficiencies among rare or cryptic species preclude assessment of community‐level processes using many existing approaches, limiting our understanding of the trends and stressors for large numbers of species. Yet evaluating the dynamics of whole communities, not just common or charismatic species, is critical to understanding the responses of biodiversity to ongoing environmental pressures.

    A recent surge in both public science and government‐funded data collection efforts has led to a wealth of biodiversity data. However, these data collection programmes use a wide range of sampling protocols (from unstructured, opportunistic observations of wildlife to well‐structured, design‐based programmes) and record information at a variety of spatiotemporal scales. As a result, available biodiversity data vary substantially in quantity and information content, which must be carefully reconciled for meaningful ecological analysis.

    Hierarchical modelling, including single‐species integrated models and hierarchical community models, has improved our ability to assess and predict biodiversity trends and processes. Here, we highlight the emerging ‘integrated community modelling’ framework that combines both data integration and community modelling to improve inferences on species‐ and community‐level dynamics.

    We illustrate the framework with a series of worked examples. Our three case studies demonstrate how integrated community models can be used to extend the geographic scope when evaluating species distributions and community‐level richness patterns; discern population and community trends over time; and estimate demographic rates and population growth for communities of sympatric species. We implemented these worked examples using multiple software methods through the R platform via packages with formula‐based interfaces and through development of custom code in JAGS, NIMBLE and Stan.

    Integrated community models provide an exciting approach to model biological and observational processes for multiple species using multiple data types and sources simultaneously, thus accounting for uncertainty and sampling error within a unified framework. By leveraging the combined benefits of both data integration and community modelling, integrated community models can produce valuable information about both common and rare species as well as community‐level dynamics, allowing for holistic evaluation of the effects of global change on biodiversity.

     
  2. Abstract

    Correlative species distribution models are widely used to quantify past shifts in ranges or communities, and to predict future outcomes under ongoing global change. Practitioners confront a wide range of potentially plausible models for ecological dynamics, but most specific applications only consider a narrow set. Here, we clarify that certain model structures can embed restrictive assumptions about key sources of forecast uncertainty into an analysis. To evaluate forecast uncertainties and our ability to explain community change, we fit and compared 39 candidate multi‐ or joint species occupancy models to avian incidence data collected at 320 sites across California during the early 20th century and resurveyed a century later. We found massive (>20,000 LOOIC) differences in within‐time information criterion across models. Poorer fitting models omitting multivariate random effects predicted less variation in species richness changes and smaller contemporary communities, with considerable variation in predicted spatial patterns in richness changes across models. The top models suggested avian environmental associations changed across time, contemporary avian occupancy was influenced by previous site‐specific occupancy states, and that both latent site variables and species associations with these variables also varied over time. Collectively, our results recapitulate that simplified model assumptions not only impact predictive fit but may mask important sources of forecast uncertainty and mischaracterize the current state of system understanding when seeking to describe or project community responses to global change. We recommend that researchers seeking to make long‐term forecasts prioritize characterizing forecast uncertainty over seeking to present a single best guess. 
    To do so reliably, we urge practitioners to employ models capable of characterizing the key sources of forecast uncertainty, where predictors, parameters and random effects may vary over time or further interact with previous occurrence states.

     
  3. Abstract

    Effective conservation requires understanding species’ abundance patterns and demographic rates across space and time. Ideally, such knowledge should be available for whole communities because variation in species’ dynamics can elucidate factors leading to biodiversity losses. However, collecting data to simultaneously estimate abundance and demographic rates of communities of species is often prohibitively time intensive and expensive. We developed a multispecies dynamic N‐occupancy model to estimate unbiased, community‐wide relative abundance and demographic rates. In this model, detection–nondetection data (e.g., repeated presence–absence surveys) are used to estimate species‐ and community‐level parameters and the effects of environmental factors. To validate our model, we conducted a simulation study to determine how and when such an approach can be valuable and found that our multispecies model outperformed comparable single‐species models in estimating abundance and demographic rates in many cases. Using data from a network of camera traps across tropical equatorial Africa, we then used our model to evaluate the statuses and trends of a forest‐dwelling antelope community. We estimated relative abundance, rates of recruitment (i.e., reproduction and immigration), and apparent survival probabilities for each species’ local population. The antelope community was fairly stable (although 17% of populations [species–park combinations] declined over the study period). Variation in apparent survival was linked more closely to differences among national parks than to individual species’ life histories. The multispecies dynamic N‐occupancy model requires only detection–nondetection data to evaluate the population dynamics of multiple sympatric species and can thus be a valuable tool for examining the reasons behind recent biodiversity loss.

     
  4. Abstract

    Integrated community models—an emerging framework in which multiple data sources for multiple species are analyzed simultaneously—offer opportunities to expand inferences beyond the single‐species and single‐data‐source approaches common in ecology. We developed a novel integrated community model that combines distance sampling and single‐visit count data; within the model, information is shared among data sources (via a joint likelihood) and species (via a random‐effects structure) to estimate abundance patterns across a community. Parameters relating to abundance are shared between data sources, and the model can specify either shared or separate observation processes for each data source. Simulations demonstrated that the model provided unbiased estimates of abundance and detection parameters even when detection probabilities varied between the data types. The integrated community model also provided more accurate and more precise parameter estimates than alternative single‐species and single‐data‐source models in many instances. We applied the model to a community of 11 herbivore species in the Masai Mara National Reserve, Kenya, and found considerable interspecific variation in response to local wildlife management practices: Five species showed higher abundances in a region with passive conservation enforcement (median across species: 4.5× higher), three species showed higher abundances in a region with active conservation enforcement (median: 3.9× higher), and the remaining three species showed no abundance differences between the two regions. Furthermore, the community average of abundance was slightly higher in the region with active conservation enforcement but not definitively so (posterior mean: higher by 0.20 animals; 95% credible interval: 1.43 fewer animals, 1.86 more animals). 
    Our integrated community modeling framework has the potential to expand the scope of inference over space, time, and levels of biological organization, but practitioners should carefully evaluate whether model assumptions are met in their systems and whether data integration is valuable for their applications.

     
  5. Abstract

    Aim

    Species occurrence data are valuable information that enables one to estimate geographical distributions, characterize niches and their evolution, and guide spatial conservation planning. Rapid increases in species occurrence data stem from increasing digitization and aggregation efforts, and citizen science initiatives. However, persistent quality issues in occurrence data can impact the accuracy of scientific findings, underscoring the importance of filtering erroneous occurrence records in biodiversity analyses.

    Innovation

    We introduce an R package, occTest, that synthesizes a growing open‐source ecosystem of biodiversity cleaning workflows to prepare occurrence data for different modelling applications. It offers a structured set of algorithms to identify potential problems with species occurrence records by employing a hierarchical organization of multiple tests. The workflow has a hierarchical structure organized in testPhases (i.e. cleaning vs. testing) that encompass different testBlocks grouping different testTypes (e.g. environmental outlier detection), which may use different testMethods (e.g. Rosner test, jackknife, etc.). Four different testBlocks characterize potential problems in geographic, environmental, human influence and temporal dimensions. Filtering and plotting functions are incorporated to facilitate the interpretation of tests. We provide examples with different data sources, with default and user‐defined parameters. Compared to other available tools and workflows, occTest offers a comprehensive suite of integrated tests, and allows multiple methods associated with each test to explore consensus among data cleaning methods. It uniquely incorporates both coordinate accuracy analysis and environmental analysis of occurrence records. Furthermore, it provides a hierarchical structure to incorporate future tests yet to be developed.

    Main conclusions

    occTest will help users understand the quality and quantity of data available before the start of data analysis, while also enabling users to filter data using either predefined rules or custom‐built rules. As a result, occTest can better assess each record's appropriateness for its intended application.

     