

Title: Predicting species distributions and community composition using satellite remote sensing predictors
Abstract Biodiversity is rapidly changing due to changes in the climate and human-related activities; thus, accurate predictions of species composition and diversity are critical to developing conservation actions and management strategies. In this paper, using satellite remote sensing products as covariates, we constructed stacked species distribution models (S-SDMs) under a Bayesian framework to build next-generation biodiversity models. The performance of these models was assessed using oak assemblages distributed across the continental United States obtained from the National Ecological Observatory Network (NEON). This study represents an attempt to evaluate the integrated predictions of biodiversity models (including assemblage diversity and composition) obtained by stacking next-generation SDMs. We found that applying constraints to assemblage predictions, such as using the probability ranking rule, does not improve biodiversity prediction models. Furthermore, we found that independent of the stacking procedure (bS-SDM versus pS-SDM versus cS-SDM), these kinds of next-generation biodiversity models do not accurately recover the observed species composition at the plot level or ecological-community scales (NEON plots are 400 m²). However, these models do return reasonable predictions at macroecological scales, i.e., moderately to highly correct assignments of species identities at the scale of NEON sites (mean area ~27 km²). Our results provide insights for advancing the accuracy of prediction of assemblage diversity and composition at different spatial scales globally. An important task for future studies is to evaluate the reliability of combining S-SDMs with direct detection of species using image spectroscopy to build a new generation of biodiversity models that accurately predict and monitor ecological assemblages through time and space.
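The three stacking procedures named in the abstract can be illustrated with a minimal sketch. It assumes the usual readings of the acronyms: binary stacking (bS-SDM) thresholds each species' predicted probability, probabilistic stacking (pS-SDM) sums the probabilities to get expected richness, and constrained stacking (cS-SDM) applies the probability ranking rule by keeping the highest-probability species up to a richness limit. The 0.5 threshold, the use of rounded pS-SDM richness as the limit, the function names, and the species labels and probabilities are all hypothetical choices for illustration, not values or code from the paper.

```python
# Illustrative sketch only: thresholds, names, and probabilities are assumptions.

def binary_stack(probs, threshold=0.5):
    """bS-SDM: a species is predicted present if its probability meets the threshold."""
    return sorted(sp for sp, p in probs.items() if p >= threshold)


def probabilistic_richness(probs):
    """pS-SDM: expected richness is the sum of per-species occurrence probabilities."""
    return sum(probs.values())


def constrained_stack(probs, richness=None):
    """cS-SDM via the probability ranking rule: keep the top-ranked species.

    If no richness limit is given, the rounded pS-SDM richness is used
    (an assumption made for this sketch).
    """
    if richness is None:
        richness = round(probabilistic_richness(probs))
    # Rank by decreasing probability; ties are broken alphabetically.
    ranked = sorted(probs, key=lambda sp: (-probs[sp], sp))
    return sorted(ranked[:richness])


# Made-up occurrence probabilities for one plot (hypothetical species labels).
plot_probs = {
    "species_a": 0.90,
    "species_b": 0.55,
    "species_c": 0.45,
    "species_d": 0.40,
    "species_e": 0.30,
}

print(binary_stack(plot_probs))
print(probabilistic_richness(plot_probs))
print(constrained_stack(plot_probs))
```

With these made-up numbers, binary stacking predicts two species while the constrained stack predicts three, which shows how the choice of stacking procedure alone can change the predicted assemblage for the same set of single-species models.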
Award ID(s):
2021898 2017843 1745562
NSF-PAR ID:
10290521
Journal Name:
Scientific Reports
Volume:
11
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    As geographic range estimates for the IUCN Red List guide conservation actions, accuracy and ecological realism are crucial. IUCN’s extent of occurrence (EOO) is the general region including the species’ range, while area of occupancy (AOO) is the subset of EOO occupied by the species. Data‐poor species with incomplete sampling present particular difficulties, but species distribution models (SDMs) can be used to predict suitable areas. Nevertheless, SDMs typically employ abiotic variables (i.e., climate) and do not explicitly account for biotic interactions that can impose range constraints. We sought to improve range estimates for data‐poor, parapatric species by masking out areas under inferred competitive exclusion. We did so for two South American spiny pocket mice: Heteromys australis (Least Concern) and Heteromys teleus (Vulnerable due to especially poor sampling), whose ranges appear restricted by competition. For both species, we estimated EOO using SDMs and AOO with four approaches: occupied grid cells, abiotic SDM prediction, and this prediction masked by approximations of the areas occupied by each species’ congener. We made the masks using support vector machines (SVMs) fit with two data types: occurrence coordinates alone; and coordinates along with SDM predictions of suitability. Given the uncertainty in calculating AOO for low‐data species, we made estimates for the lower and upper bounds for AOO, but only make recommendations for H. teleus as its full known range was considered. The SVM approaches (especially the second one) had lower classification error and made more ecologically realistic delineations of the contact zone. For H. teleus, the lower AOO bound (a strongly biased underestimate) corresponded to Endangered (occupied grid cells), while the upper bounds (other approaches) led to Near Threatened. As we currently lack data to determine the species’ true occupancy within the post‐processed SDM prediction, we recommend that an updated listing for H. teleus include these bounds for AOO. This study advances methods for estimating the upper bound of AOO and highlights the need for better ways to produce unbiased estimates of lower bounds. More generally, the SVM approaches for post‐processing SDM predictions hold promise for improving range estimates for other uses in biogeography and conservation.

     
  2. Abstract

    Species' ranges are changing at accelerating rates. Species distribution models (SDMs) are powerful tools that help rangers and decision‐makers prepare for reintroductions, range shifts, reductions and/or expansions by predicting habitat suitability across landscapes. Yet, range‐expanding or ‐shifting species in particular face other challenges that traditional SDM procedures cannot quantify, due to large differences between a species' currently occupied range and potential future range. The realism of SDMs is thus lost and not as useful for conservation management in practice. Here, we address these challenges with an extended assessment of habitat suitability through an integrated SDM database (iSDMdb).

    The iSDMdb is a spatial database of predicted sites in a species' prediction range, derived from SDM results, and is a single spatial feature that contains additional, user‐friendly data fields that synthesise and summarise SDM predictions and uncertainty, human impacts, restoration features, novel preferences in novel spaces and management priorities. To illustrate its utility, we used the endangered New Zealand sea lion Phocarctos hookeri. We consulted with wildlife rangers, decision‐makers and sea lion experts to supplement SDM predictions with additional, more realistic and applicable information for management.

    Almost half the data fields included in this database resulted from engaging with these end‐users during our study. The SDM found 395 predicted sites. However, the iSDMdb's additional assessments showed that the actual suitability of most sites (90%) was questionable due to human impacts. >50% of sites contained unnatural barriers (fences, grazing grasslands), and 75% of sites had roads located within the species' range of inland movement. Just 5% of the predicted sites were mostly (>80%) protected.

    Integrating SDM results with supplemental assessments provides a way to address SDM limitations, especially for range‐expanding or ‐shifting species. SDM products for conservation applications have been critiqued for lacking transparency and interpretation support, and ineffectively communicating uncertainty. The iSDMdb addresses these issues and enhances the practical relevance and utility of SDMs for stakeholders, rangers and decision‐makers. We exemplify how to build an iSDMdb using open‐source tools, and how to make diverse, complex assessments more accessible for end‐users.

     
  3. Abstract

    Aim

    Species distribution models (SDMs) that integrate presence‐only and presence–absence data offer a promising avenue to improve information on species' geographic distributions. The use of such ‘integrated SDMs’ on a species range‐wide extent has been constrained by the often limited presence–absence data and by the heterogeneous sampling of the presence‐only data. Here, we evaluate integrated SDMs for studying species ranges with a novel expert range map‐based evaluation. We build new understanding about how integrated SDMs address issues of estimation accuracy and data deficiency and thereby offer advantages over traditional SDMs.

    Location

    South and Central America.

    Time Period

    1979–2017.

    Major Taxa Studied

    Hummingbirds.

    Methods

    We build integrated SDMs by linking two observation models – one for each data type – to the same underlying spatial process. We validate SDMs with two schemes: (i) cross‐validation with presence–absence data and (ii) comparison with respect to the species' whole range as defined with IUCN range maps. We also compare models relative to the estimated response curves and compute the association between the benefit of the data integration and the number of presence records in each data set.

    Results

    The integrated SDM accounting for the spatially varying sampling intensity of the presence‐only data was one of the top performing models in both model validation schemes. Presence‐only data alleviated overly large niche estimates, and data integration was beneficial compared to modelling solely presence‐only data for species which had few presence points when predicting the species' whole range. On the community level, integrated models improved the species richness prediction.

    Main Conclusions

    Integrated SDMs combining presence‐only and presence–absence data successfully borrow strengths from both data types and offer improved predictions of species' ranges. Integrated SDMs can potentially alleviate the impacts of taxonomically and geographically uneven sampling and leverage the detailed sampling information in presence–absence data.

     
  4. Abstract

    Species distribution models (SDMs) that rely on regional‐scale environmental variables will play a key role in forecasting species occurrence in the face of climate change. However, in the Anthropocene, a number of local‐scale anthropogenic variables, including wildfire history, land‐use change, invasive species, and ecological restoration practices can override regional‐scale variables to drive patterns of species distribution. Incorporating these human‐induced factors into SDMs remains a major research challenge, in part because spatial variability in these factors occurs at fine scales, rendering prediction over regional extents problematic. Here, we used big sagebrush (Artemisia tridentata Nutt.) as a model species to explore whether including human‐induced factors improves the fit of the SDM. We applied a Bayesian hurdle spatial approach using 21,753 data points of field‐sampled vegetation obtained from the LANDFIRE program to model sagebrush occurrence and cover by incorporating fire history metrics and restoration treatments from 1980 to 2015 throughout the Great Basin of North America. Models including fire attributes and restoration treatments performed better than those including only climate and topographic variables. Number of fires and fire occurrence had the strongest relative effects on big sagebrush occurrence and cover, respectively. The models predicted that the probability of big sagebrush occurrence decreases by 1.2% (95% CI: −6.9%, 0.6%) when one fire occurs and cover decreases by 44.7% (95% CI: −47.9%, −41.3%) if at least one fire occurred over the 36 year period of record. Restoration practices increased the probability of big sagebrush occurrence but had minimal effect on cover. Our results demonstrate the potential value of including disturbance and land management along with climate in models to predict species distributions. As an increasing number of datasets representing land‐use history become available, we anticipate that our modeling framework will have broad relevance across a range of biomes and species.

     
  5. Abstract

    Accurately mapping tree species composition and diversity is a critical step towards spatially explicit and species-specific ecological understanding. The National Ecological Observatory Network (NEON) is a valuable source of open ecological data across the United States. Freely available NEON data include in-situ measurements of individual trees, including stem locations, species, and crown diameter, along with the NEON Airborne Observation Platform (AOP) airborne remote sensing imagery, including hyperspectral, multispectral, and light detection and ranging (LiDAR) data products. An important aspect of predicting species using remote sensing data is creating high-quality training sets for optimal classification purposes. Ultimately, manually creating training data is an expensive and time-consuming task that relies on human analyst decisions and may require external data sets or information. We combine in-situ and airborne remote sensing NEON data to evaluate the impact of automated training set preparation and a novel data preprocessing workflow on classifying the four dominant subalpine coniferous tree species at the Niwot Ridge Mountain Research Station forested NEON site in Colorado, USA. We trained pixel-based Random Forest (RF) machine learning models using a series of training data sets along with remote sensing raster data as descriptive features. The highest classification accuracies, 69% and 60% based on internal RF error assessment and an independent validation set, respectively, were obtained using circular tree crown polygons created with half the maximum crown diameter per tree. LiDAR-derived data products were the most important features for species classification, followed by vegetation indices. This work contributes to the open development of well-labeled training data sets for forest composition mapping using openly available NEON data without requiring external data collection, manual delineation steps, or site-specific parameters. 
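The automated training-set step in the last abstract above (circular tree crown polygons sized from each tree's maximum crown diameter, used to label pixels for a pixel-based classifier) can be sketched as follows. This is an illustration under stated assumptions, not the authors' workflow: the phrase "half the maximum crown diameter" is read here as the circle's diameter being half the measured maximum, and the 1 m pixel size, the grid origin at (0, 0), and all function names are hypothetical.

```python
import math

# Illustrative sketch only: the sizing rule, pixel size, and grid layout are assumptions.

def crown_radius(max_crown_diameter_m, fraction=0.5):
    """Radius of the circular crown polygon.

    Assumed reading: the circle's diameter is `fraction` of the maximum
    crown diameter, so the radius is half of that.
    """
    return fraction * max_crown_diameter_m / 2.0


def pixels_in_crown(stem_x, stem_y, max_crown_diameter_m, pixel_size_m=1.0):
    """Return (row, col) indices of pixels whose centres fall inside the crown circle.

    The raster is assumed to start at (0, 0), with columns increasing along x
    and rows increasing along y.
    """
    r = crown_radius(max_crown_diameter_m)
    col_min = math.floor((stem_x - r) / pixel_size_m)
    col_max = math.floor((stem_x + r) / pixel_size_m)
    row_min = math.floor((stem_y - r) / pixel_size_m)
    row_max = math.floor((stem_y + r) / pixel_size_m)

    selected = []
    for row in range(row_min, row_max + 1):
        for col in range(col_min, col_max + 1):
            centre_x = (col + 0.5) * pixel_size_m
            centre_y = (row + 0.5) * pixel_size_m
            if math.hypot(centre_x - stem_x, centre_y - stem_y) <= r:
                selected.append((row, col))
    return selected


# A hypothetical tree at (2.5 m, 2.5 m) with a 4 m maximum crown diameter.
labelled = pixels_in_crown(2.5, 2.5, 4.0)
print(labelled)
```

Each selected pixel would then inherit the tree's species label and be paired with the raster features at that location to form one training row for the classifier.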