Title: Clustering Species With Residual Covariance Matrix in Joint Species Distribution Models
Modeling species distributions over space and time is one of the major research topics in both ecology and conservation biology. Joint Species Distribution Models (JSDMs) have recently been introduced as a tool to better model community data, by inferring a residual covariance matrix between species, after accounting for species' response to the environment. However, these models are computationally demanding, even when latent factors, a common tool for dimension reduction, are used. To address this issue, Taylor-Rodriguez et al. (2017) proposed to use a Dirichlet process, a Bayesian nonparametric prior, to further reduce model dimension by clustering species in the residual covariance matrix. Here, we built on this approach to include prior knowledge on the potential number of clusters, and instead used a Pitman–Yor process to address some critical limitations of the Dirichlet process. We therefore propose a framework that includes prior knowledge in the residual covariance matrix, providing a tool to analyze clusters of species that share the same residual associations with respect to other species. We applied our methodology to a case study of plant communities in a protected area of the French Alps (the Bauges Regional Park), and demonstrated that our extensions improve dimension reduction and reveal additional information from the residual covariance matrix, notably showing how the estimated clusters are compatible with plant traits, endorsing their importance in shaping communities.
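As a rough illustration of how the two priors named in the abstract differ, the following minimal sketch draws cluster sizes from the Chinese restaurant process representation of a Pitman–Yor process. It is not the model or code from the article: the function name `sample_partition` and the parameters `alpha` (concentration) and `discount` are hypothetical, and for simplicity the sketch assumes `alpha > 0`. Setting `discount=0` gives the Dirichlet process case.

```python
import random


def sample_partition(n_items, alpha, discount=0.0, seed=None):
    """Draw cluster sizes from a Pitman-Yor Chinese restaurant process.

    Illustrative only. discount=0 recovers the Dirichlet process.
    This sketch assumes alpha > 0 and 0 <= discount < 1.
    """
    if alpha <= 0.0 or not (0.0 <= discount < 1.0):
        raise ValueError("need alpha > 0 and 0 <= discount < 1")
    rng = random.Random(seed)
    sizes = []  # sizes[k] = number of items currently in cluster k
    for _ in range(n_items):
        k_clusters = len(sizes)
        # Unnormalised weights: each existing cluster gets (size - discount),
        # a new cluster gets (alpha + discount * number_of_clusters).
        weights = [size - discount for size in sizes]
        weights.append(alpha + discount * k_clusters)
        choice = rng.choices(range(k_clusters + 1), weights=weights)[0]
        if choice == k_clusters:
            sizes.append(1)  # open a new cluster
        else:
            sizes[choice] += 1  # join an existing cluster
    return sizes


def mean_cluster_count(n_items, alpha, discount, n_draws=30):
    """Average number of clusters over several seeded prior draws."""
    counts = [
        len(sample_partition(n_items, alpha, discount, seed=s))
        for s in range(n_draws)
    ]
    return sum(counts) / n_draws
```

Comparing `mean_cluster_count(200, 1.0, 0.0)` with `mean_cluster_count(200, 1.0, 0.5)` shows the second parameter at work: under these assumptions a positive discount tends to yield more clusters a priori, which gives extra control over the prior number of clusters.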
Award ID(s):
1754443
PAR ID:
10232893
Journal Name:
Frontiers in Ecology and Evolution
Volume:
9
ISSN:
2296-701X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Many management and conservation contexts can benefit from understanding relationships between species abundances, which can be used to improve predictions of species occurrence and abundance. We present conditional prediction as a tool to capture information about species abundances via residual covariance between species. From a fitted joint species distribution model, this framework produces a species coefficient matrix that contains relationships between species abundances. The species coefficients allow co‐observed species to be treated as a second set of predictors supplementing covariates in the model to improve prediction. We use simulations to demonstrate the potential benefits and limitations of conditional prediction across data types and species covariance before applying conditional prediction to two management contexts with real data. Simulations demonstrate that conditional prediction provides the largest benefits to continuous data and when there is residual covariance between many species. In our first application, we show that conditioning on other species improves in‐sample and out‐of‐sample predictions of fish and invertebrate species, including Atlantic cod. In our second application, we show that the species coefficient matrix can be used to identify bird species at risk of nest parasitism by Brown‐headed Cowbirds. Synthesis and applications. We present guidelines for using conditional prediction, which can help understand relationships between species abundances, improve predictions and inform conservation in a variety of contexts.
  2. Abstract Determining the spatial distributions of species and communities is a key task in ecology and conservation efforts. Joint species distribution models are a fundamental tool in community ecology that use multi‐species detection–nondetection data to estimate species distributions and biodiversity metrics. The analysis of such data is complicated by residual correlations between species, imperfect detection, and spatial autocorrelation. While many methods exist to accommodate each of these complexities, there are few examples in the literature that address and explore all three complexities simultaneously. Here we developed a spatial factor multi‐species occupancy model to explicitly account for species correlations, imperfect detection, and spatial autocorrelation. The proposed model uses a spatial factor dimension reduction approach and Nearest Neighbor Gaussian Processes to ensure computational efficiency for data sets with both a large number of species (e.g., >100) and spatial locations (e.g., 100,000). We compared the proposed model performance to five alternative models, each addressing a subset of the three complexities. We implemented the proposed and alternative models in the spOccupancy software, designed to facilitate application via an accessible, well documented, and open‐source R package. Using simulations, we found that ignoring the three complexities when present leads to inferior model predictive performance, and the impacts of failing to account for one or more complexities will depend on the objectives of a given study. Using a case study on 98 bird species across the continental US, the spatial factor multi‐species occupancy model had the highest predictive performance among the alternative models.
Our proposed framework, together with its implementation in spOccupancy, serves as a user‐friendly tool to understand spatial variation in species distributions and biodiversity while addressing common complexities in multi‐species detection–nondetection data.
  3. Functional Principal Component Analysis (FPCA) has become a widely used dimension reduction tool for functional data analysis. When additional covariates are available, existing FPCA models integrate them either in the mean function or in both the mean function and the covariance function. However, methods of the first kind are not suitable for data that display second-order variation, while those of the second kind are time-consuming and make it difficult to perform subsequent statistical analyses on the dimension-reduced representations. To tackle these issues, we introduce an eigen-adjusted FPCA model that integrates covariates in the covariance function only through its eigenvalues. In particular, different structures on the covariate-specific eigenvalues—corresponding to different practical problems—are discussed to illustrate the model’s flexibility as well as utility. To handle functional observations under … 
  4. Background and aims Plant interactions with soil microbial communities are critical for understanding plant health, improving horticultural and agricultural outcomes, and maintaining diverse natural communities. In some cases, disease suppressive soils enhance plant survival in the presence of pathogens. However, species-specific differences and seasonal variation complicate our understanding of the drivers of soil fungal communities and their consequences for plants. Here, we aim to describe soil fungal communities across Rhododendron species and seasons, as well as to test for fungal indicators of species and seasons in the soil. Further, we tested for correlations between fungal community composition and prior experimental quantification of disease suppressive soils. Methods We conducted high throughput sequencing of the fungal communities found in soil collected under 14 Rhododendron species and across two seasons (April, October) at two sites in Ohio, USA. We described these soils and used phylogenetic analyses to ask whether fungal community composition correlated with increased plant survival with the addition of whole soil communities from a prior greenhouse experiment. Results We found effects of Rhododendron species and season on fungal communities. Fungal community composition correlated with survival following exposure to whole soil microbial communities, though this result depended on the presence of R. minus. We identified 45 Trichoderma taxa across our soil samples, and some Trichoderma were significantly associated with particular Rhododendron species in indicator species analyses. Conclusion The correlation between plant responses to soil biotic communities and fungal community composition, as well as the presence of potential beneficial taxa such as Trichoderma and mycorrhizal fungi, are consistent with fungal-mediated survival benefits from the pathogen Phytophthora cinnamomi.
  5. Abstract Numerous modelling techniques exist to estimate abundance of plant and animal populations. The most accurate methods account for multiple complexities found in ecological data, such as observational biases, spatial autocorrelation, and species correlations. There is, however, a lack of user‐friendly and computationally efficient software to implement the various models, particularly for large data sets. We developed the spAbundance R package for fitting spatially explicit Bayesian single‐species and multi‐species hierarchical distance sampling models, N‐mixture models, and generalized linear mixed models. The models within the package can account for spatial autocorrelation using Nearest Neighbour Gaussian Processes and accommodate species correlations in multi‐species models using a latent factor approach, which enables model fitting for data sets with large numbers of sites and/or species. We provide three vignettes and three case studies that highlight spAbundance functionality. We used spatially explicit multi‐species distance sampling models to estimate density of 16 bird species in Florida, USA, an N‐mixture model to estimate black‐throated blue warbler (Setophaga caerulescens) abundance in New Hampshire, USA, and a spatial linear mixed model to estimate forest above‐ground biomass across the continental USA. spAbundance provides a user‐friendly, formula‐based interface to fit a variety of univariate and multivariate spatially explicit abundance models. The package serves as a useful tool for ecologists and conservation practitioners to generate improved inference and predictions on the spatial drivers of abundance in populations and communities.