

Title: Clustering Species With Residual Covariance Matrix in Joint Species Distribution Models
Modeling species distributions over space and time is one of the major research topics in both ecology and conservation biology. Joint species distribution models (JSDMs) have recently been introduced as a tool to better model community data by inferring a residual covariance matrix between species after accounting for species' responses to the environment. However, these models are computationally demanding, even when latent factors, a common tool for dimension reduction, are used. To address this issue, Taylor-Rodriguez et al. (2017) proposed using a Dirichlet process, a Bayesian nonparametric prior, to further reduce model dimension by clustering species in the residual covariance matrix. Here, we built on this approach to include prior knowledge on the potential number of clusters, and instead used a Pitman–Yor process to address some critical limitations of the Dirichlet process. We therefore propose a framework that includes prior knowledge in the residual covariance matrix, providing a tool to analyze clusters of species that share the same residual associations with respect to other species. We applied our methodology to a case study of plant communities in a protected area of the French Alps (the Bauges Regional Park), and demonstrated that our extensions improve dimension reduction and reveal additional information from the residual covariance matrix, notably showing how the estimated clusters are compatible with plant traits, endorsing their importance in shaping communities.
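To make the clustering idea in the abstract more concrete, the following is a minimal sketch, not the authors' model or code. It assumes the standard Chinese restaurant construction of a Pitman–Yor process (a discount of 0 recovers the Dirichlet process) and a generic latent factor covariance of the form loadings times loadings transposed plus a diagonal noise term. The function names, the parameter names `alpha` (concentration) and `sigma` (discount), and the Gaussian loadings are all assumptions chosen for illustration.

```python
import numpy as np


def sample_pitman_yor_partition(n_items, alpha, sigma, rng):
    """Draw cluster labels from the Chinese restaurant construction of a
    Pitman-Yor process (hypothetical helper; sigma = 0 gives a Dirichlet process)."""
    if not 0.0 <= sigma < 1.0 or alpha <= -sigma:
        raise ValueError("need 0 <= sigma < 1 and alpha > -sigma")
    labels = np.zeros(n_items, dtype=int)
    counts = [1]  # the first item always opens the first cluster
    for i in range(1, n_items):
        k = len(counts)
        # Existing cluster j has weight (n_j - sigma); a new cluster has
        # weight (alpha + k * sigma). The weights sum to (alpha + i).
        weights = np.array([c - sigma for c in counts] + [alpha + k * sigma])
        choice = rng.choice(k + 1, p=weights / weights.sum())
        if choice == k:
            counts.append(1)
        else:
            counts[choice] += 1
        labels[i] = choice
    return labels


def clustered_residual_covariance(labels, n_factors, noise_var, rng):
    """Build a residual covariance in which species in the same cluster share
    one row of the factor loading matrix (illustrative assumption)."""
    n_clusters = labels.max() + 1
    cluster_loadings = rng.normal(size=(n_clusters, n_factors))
    loadings = cluster_loadings[labels]  # one row per species
    return loadings @ loadings.T + noise_var * np.eye(len(labels))


rng = np.random.default_rng(0)
labels = sample_pitman_yor_partition(n_items=60, alpha=1.0, sigma=0.3, rng=rng)
cov = clustered_residual_covariance(labels, n_factors=3, noise_var=1.0, rng=rng)
print("clusters:", labels.max() + 1, "covariance shape:", cov.shape)
```

In this sketch, two species in the same cluster have identical covariances with every third species, which is one way to read "species that share the same residual associations with respect to other species". Only one loading row per cluster needs to be estimated, rather than one per species.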
Award ID(s):
1754443
NSF-PAR ID:
10232893
Journal Name:
Frontiers in Ecology and Evolution
Volume:
9
ISSN:
2296-701X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Determining the spatial distributions of species and communities is a key task in ecology and conservation efforts. Joint species distribution models are a fundamental tool in community ecology that use multi‐species detection–nondetection data to estimate species distributions and biodiversity metrics. The analysis of such data is complicated by residual correlations between species, imperfect detection, and spatial autocorrelation. While many methods exist to accommodate each of these complexities, there are few examples in the literature that address and explore all three complexities simultaneously. Here we developed a spatial factor multi‐species occupancy model to explicitly account for species correlations, imperfect detection, and spatial autocorrelation. The proposed model uses a spatial factor dimension reduction approach and Nearest Neighbor Gaussian Processes to ensure computational efficiency for data sets with both a large number of species (e.g., >100) and spatial locations (e.g., 100,000). We compared the proposed model performance to five alternative models, each addressing a subset of the three complexities. We implemented the proposed and alternative models in the spOccupancy software, designed to facilitate application via an accessible, well documented, and open‐source R package. Using simulations, we found that ignoring the three complexities when present leads to inferior model predictive performance, and the impacts of failing to account for one or more complexities will depend on the objectives of a given study. Using a case study on 98 bird species across the continental US, the spatial factor multi‐species occupancy model had the highest predictive performance among the alternative models. Our proposed framework, together with its implementation in spOccupancy, serves as a user‐friendly tool to understand spatial variation in species distributions and biodiversity while addressing common complexities in multi‐species detection–nondetection data.

  2. Summary

    In functional data analysis, curves or surfaces are observed, up to measurement error, at a finite set of locations, for, say, a sample of n individuals. Often, the curves are homogeneous, except perhaps for individual-specific regions that provide heterogeneous behaviour (e.g. ‘damaged’ areas of irregular shape on an otherwise smooth surface). Motivated by applications with functional data of this nature, we propose a Bayesian mixture model, with the aim of dimension reduction, by representing the sample of n curves through a smaller set of canonical curves. We propose a novel prior on the space of probability measures for a random curve which extends the popular Dirichlet priors by allowing local clustering: non-homogeneous portions of a curve can be allocated to different clusters and the n individual curves can be represented as recombinations (hybrids) of a few canonical curves. More precisely, the prior proposed envisions a conceptual hidden factor with k-levels that acts locally on each curve. We discuss several models incorporating this prior and illustrate its performance with simulated and real data sets. We examine theoretical properties of the proposed finite hybrid Dirichlet mixtures, specifically, their behaviour as the number of the mixture components goes to ∞ and their connection with Dirichlet process mixtures.

  3. Abstract

    Understanding how communities respond to perturbations requires us to consider not only changes in the abundance of individual species but also correlated changes that can emerge through interspecific effects. However, our knowledge of this phenomenon is mostly constrained to situations where interspecific effects are fixed. Here, we introduce a framework to disentangle the impact of species correlated responses on community sensitivity to perturbations when interspecific effects change over time due to cyclic or chaotic population dynamics. We partition the volume expansion rate of perturbed abundances (community sensitivity) into contributions of individual species and of species correlated responses by converting the time‐varying Jacobian matrix containing interspecific effects into a time‐varying covariance matrix. Using population dynamics models, we demonstrate that species correlated responses change considerably across time and continuously alternate between reducing and having no impact on community sensitivity. Importantly, these alternating impacts depend on the abundance of particular species and can be detected even from noisy time series. We showcase our framework using two experimental predator–prey time series and find that the impact of species correlated responses is modulated by prey abundance—as theoretically expected. Our results provide new insights into how and when species interactions can dampen community sensitivity when abundances fluctuate over time.

  4. The criminogenic dimensions of conservation are highly relevant to contemporary protected area management. Research on crime target suitability in the field of criminology has built new understanding of how the characteristics of crime targets affect their suitability for being targeted by offenders. In the last decade, criminologists have sought to apply and adapt target suitability frameworks to explain wildlife-related crimes. This study seeks to build upon the extant knowledge base and advance the adaptation and application of target suitability research. First, we drew on research, fieldwork, and empirical evidence from conservation science to develop a poaching-stage model with a focus on live specimens or wild animals, rather than a market-stage, wildlife-product-focused target suitability model. Second, we collected data in the Intensive Protection Zone of Bukit Barisan Selatan National Park (BBSNP), Sumatra, Indonesia through surveys with local community members (n=400) and a three-day focus group with conservation practitioners (n=25). Our target suitability model, IPOACHED, predicts that species that are in-demand, passive, obtainable, all-purpose, conflict-prone, hideable, extractable, and disposable are more suitable species for poaching and therefore more vulnerable. When applying our IPOACHED model, we find that the most common response to species characteristics that drive poaching in BBSNP was that they are in-demand, with support for cultural or symbolic value (n=101, 25% of respondents), ecological value (n=164, 35%), and economic value (n=234, 59%). There was moderate support for the conflict-prone dimension of the IPOACHED model (n=70, 18%). Other factors, such as a species' lack of passiveness, obtainability, and extractability, hamper poaching regardless of value. Our model serves as an explanatory or predictive tool for understanding poaching within a conservation-based management unit (e.g., a protected area) rather than for a specific use market (e.g., pets). Conservation researchers and practitioners can use and adapt our model and survey instruments to help explain and predict poaching of species through the integration of knowledge and opinions from local communities and conservation professionals, with the ultimate goal of preventing wildlife poaching.
  5. Determinantal point processes (DPPs) have recently become popular tools for modeling the phenomenon of negative dependence, or repulsion, in data. However, our understanding of an analogue of a classical parametric statistical theory is rather limited for this class of models. In this work, we investigate a parametric family of Gaussian DPPs with a clearly interpretable effect of parametric modulation on the observed points. We show that parameter modulation impacts the observed points by introducing directionality in their repulsion structure, and the principal directions correspond to the directions of maximal (i.e., the most long-ranged) dependency. This model readily yields a viable alternative to principal component analysis (PCA) as a dimension reduction tool that favors directions along which the data are most spread out. This methodological contribution is complemented by a statistical analysis of a spiked model similar to that employed for covariance matrices as a framework to study PCA. These theoretical investigations unveil intriguing questions for further examination in random matrix theory, stochastic geometry, and related topics.
