NSF PAR Search | NSF Public Access Repository

Joint species distribution models with imperfect detection for high‐dimensional spatial data

https://doi.org/10.1002/ecy.4137

Doser, Jeffrey W.; Finley, Andrew O.; Banerjee, Sudipto (July 2023, Ecology)

Abstract Determining the spatial distributions of species and communities is a key task in ecology and conservation efforts. Joint species distribution models are a fundamental tool in community ecology that use multi‐species detection–nondetection data to estimate species distributions and biodiversity metrics. The analysis of such data is complicated by residual correlations between species, imperfect detection, and spatial autocorrelation. While many methods exist to accommodate each of these complexities, there are few examples in the literature that address and explore all three complexities simultaneously. Here we developed a spatial factor multi‐species occupancy model to explicitly account for species correlations, imperfect detection, and spatial autocorrelation. The proposed model uses a spatial factor dimension reduction approach and Nearest Neighbor Gaussian Processes to ensure computational efficiency for data sets with both a large number of species (e.g., >100) and spatial locations (e.g., 100,000). We compared the proposed model performance to five alternative models, each addressing a subset of the three complexities. We implemented the proposed and alternative models in the spOccupancy software, designed to facilitate application via an accessible, well documented, and open‐source R package. Using simulations, we found that ignoring the three complexities when present leads to inferior model predictive performance, and the impacts of failing to account for one or more complexities will depend on the objectives of a given study. Using a case study on 98 bird species across the continental US, the spatial factor multi‐species occupancy model had the highest predictive performance among the alternative models. Our proposed framework, together with its implementation in spOccupancy, serves as a user‐friendly tool to understand spatial variation in species distributions and biodiversity while addressing common complexities in multi‐species detection–nondetection data.
Conjugate sparse plus low rank models for efficient Bayesian interpolation of large spatial data

https://doi.org/10.1002/env.2748

Shirota, Shinichiro; Finley, Andrew O.; Cook, Bruce D.; Banerjee, Sudipto (August 2022, Environmetrics)

Abstract A key challenge in spatial data science is the analysis for massive spatially‐referenced data sets. Such analyses often proceed from Gaussian process specifications that can produce rich and robust inference, but involve dense covariance matrices that lack computationally exploitable structures. Recent developments in spatial statistics offer a variety of massively scalable approaches. Bayesian inference and hierarchical models, in particular, have gained popularity due to their richness and flexibility in accommodating spatial processes. Our current contribution is to provide computationally efficient exact algorithms for spatial interpolation of massive data sets using scalable spatial processes. We combine low‐rank Gaussian processes with efficient sparse approximations. Following recent work by Zhang et al. (2019), we model the low‐rank process using a Gaussian predictive process (GPP) and the residual process as a sparsity‐inducing nearest‐neighbor Gaussian process (NNGP). A key contribution here is to implement these models using exact conjugate Bayesian modeling to avoid expensive iterative algorithms. Through the simulation studies, we evaluate performance of the proposed approach and the robustness of our models, especially for long range prediction. We implement our approaches for remotely sensed light detection and ranging (LiDAR) data collected over the US Forest Service Tanana Inventory Unit (TIU) in a remote portion of Interior Alaska.
spOccupancy: An R package for single‐species, multi‐species, and integrated spatial occupancy models

https://doi.org/10.1111/2041-210X.13897

Doser, Jeffrey W.; Finley, Andrew O.; Kéry, Marc; Zipkin, Elise F. (May 2022, Methods in Ecology and Evolution)

Abstract Occupancy modelling is a common approach to assess species distribution patterns, while explicitly accounting for false absences in detection–nondetection data. Numerous extensions of the basic single‐species occupancy model exist to model multiple species, spatial autocorrelation and to integrate multiple data types. However, development of specialized and computationally efficient software to incorporate such extensions, especially for large datasets, is scarce or absent. We introduce the spOccupancy R package designed to fit single‐species and multi‐species spatially explicit occupancy models. We fit all models within a Bayesian framework using Pólya‐Gamma data augmentation, which results in fast and efficient inference. spOccupancy provides functionality for data integration of multiple single‐species detection–nondetection datasets via a joint likelihood framework. The package leverages Nearest Neighbour Gaussian Processes to account for spatial autocorrelation, which enables spatially explicit occupancy modelling for potentially massive datasets (e.g. 1,000s–100,000s of sites). spOccupancy provides user‐friendly functions for data simulation, model fitting, model validation (by posterior predictive checks), model comparison (using information criteria and k‐fold cross‐validation) and out‐of‐sample prediction. We illustrate the package's functionality via a vignette, simulated data analysis and two bird case studies. The spOccupancy package provides a user‐friendly platform to fit a variety of single and multi‐species occupancy models, making it straightforward to address detection biases and spatial autocorrelation in species distribution models even for large datasets.
Integrating automated acoustic vocalization data and point count surveys for estimation of bird abundance

https://doi.org/10.1111/2041-210X.13578

Doser, Jeffrey W.; Finley, Andrew O.; Weed, Aaron S.; Zipkin, Elise F. (March 2021, Methods in Ecology and Evolution)

Abstract Monitoring wildlife abundance across space and time is an essential task to study their population dynamics and inform effective management. Acoustic recording units are a promising technology for efficiently monitoring bird populations and communities. While current acoustic data models provide information on the presence/absence of individual species, new approaches are needed to monitor population abundance, ideally across large spatio‐temporal regions. We present an integrated modelling framework that combines high‐quality but temporally sparse bird point count survey data with acoustic recordings. Our models account for imperfect detection in both data types and false positive errors in the acoustic data. Using simulations, we compare the accuracy and precision of abundance estimates using differing amounts of acoustic vocalizations obtained from a clustering algorithm, point count data, and a subset of manually validated acoustic vocalizations. We also use our modelling framework in a case study to estimate abundance of the Eastern Wood‐Pewee (Contopus virens) in Vermont, USA. The simulation study reveals that combining acoustic and point count data via an integrated model improves accuracy and precision of abundance estimates compared with models informed by either acoustic or point count data alone. Improved estimates are obtained across a wide range of scenarios, with the largest gains occurring when detection probability for the point count data is low. Combining acoustic data with only a small number of point count surveys yields estimates of abundance without the need for validating any of the identified vocalizations from the acoustic data. Within our case study, the integrated models provided moderate support for a decline of the Eastern Wood‐Pewee in this region. Our integrated modelling approach combines dense acoustic data with few point count surveys to deliver reliable estimates of species abundance without the need for manual identification of acoustic vocalizations or a prohibitively expensive large number of repeated point count surveys. Our proposed approach offers an efficient monitoring alternative for large spatio‐temporal regions when point count data are difficult to obtain or when monitoring is focused on rare species with low detection probability.
Highly Scalable Bayesian Geostatistical Modeling via Meshed Gaussian Processes on Partitioned Domains

https://doi.org/10.1080/01621459.2020.1833889

Peruzzi, Michele; Banerjee, Sudipto; Finley, Andrew O. (April 2022, Journal of the American Statistical Association)

spNNGP R Package for Nearest Neighbor Gaussian Process Models

https://doi.org/10.18637/jss.v103.i05

Finley, Andrew O.; Datta, Abhirup; Banerjee, Sudipto (January 2022, Journal of Statistical Software)

Trends in bird abundance differ among protected forests but not bird guilds

https://doi.org/10.1002/eap.2377

Doser, Jeffrey W.; Weed, Aaron S.; Zipkin, Elise F.; Miller, Kathryn M.; Finley, Andrew O. (September 2021, Ecological Applications)

Bayesian spatially varying coefficient models in the spBayes R package

https://doi.org/10.1016/j.envsoft.2019.104608

Finley, Andrew O.; Banerjee, Sudipto (March 2020, Environmental Modelling & Software)

Efficient Algorithms for Bayesian Nearest Neighbor Gaussian Processes

https://doi.org/10.1080/10618600.2018.1537924

Finley, Andrew O.; Datta, Abhirup; Cook, Bruce D.; Morton, Douglas C.; Andersen, Hans E.; Banerjee, Sudipto (November 2018, Journal of Computational and Graphical Statistics)

