

This content will become publicly available on February 1, 2025

Title: The influence of the number and distribution of background points in presence-background species distribution models
Species distribution models (SDMs), which relate recorded observations (presences) and absences or background points to environmental characteristics, are powerful tools used to generate hypotheses about the biogeography, ecology, and conservation of species. Although many researchers have examined the effects of presence and background point distributions on model outputs, they have not systematically evaluated the effects of various methods of background point sampling on the performance of a single model algorithm across many species. Therefore, a consensus on the preferred methods of background point sampling is lacking. Here, we conducted presence-background SDMs for 20 vertebrate species in North America under a variety of background point conditions, varying the number of background points used, the size of the buffer used to constrain the background points around the occurrences, and the percentage of background points sampled within the buffer (“spatial weighting”). We evaluated the accuracy and transferability of the models using the Boyce index, overlap with expert-generated range maps, and the area overpredicted and underpredicted by the SDM (and AUC for comparability with other studies). SDM performance is highly dependent on the species modelled but is also affected by the number and spread of background points. Models with little spatial weighting had high accuracy (overlap values), but extreme extrapolation errors and overprediction. In contrast, SDMs with high transferability (high Boyce index values and low overprediction) had moderate-to-high spatial weighting. These results emphasize the importance of both background points and evaluation metric selection in SDMs. For other, more successful metrics, using many background points with spatial weighting may be preferred for models with large extents. These results can assist researchers in selecting the background point parameters most relevant for their research question, allowing them to fine-tune their hypotheses on the distribution of species through space and time.
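The three background-point parameters named in the abstract (number of points, buffer size, and the share of points drawn inside the buffer) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the rectangular extent, the planar coordinates, the rejection-sampling approach, and the choice to draw the remaining points from outside the buffer are all assumptions made for illustration.

```python
import math
import random


def in_buffer(point, presences, radius):
    """Return True if point lies within `radius` of at least one presence."""
    px, py = point
    return any(math.hypot(px - x, py - y) <= radius for x, y in presences)


def sample_background(presences, extent, n_points, radius, weight, seed=0):
    """Draw background points with a fraction `weight` inside the buffer.

    Hypothetical helper: `extent` is (xmin, ymin, xmax, ymax) in the same
    planar units as `radius`; the remaining points come from outside the
    buffer but inside the extent (an assumption, not the paper's protocol).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = extent
    n_inside = round(n_points * weight)
    n_outside = n_points - n_inside
    inside, outside = [], []
    attempts = 0
    max_attempts = 1000 * max(n_points, 1)
    # Rejection sampling: draw uniformly over the extent and route each
    # candidate to whichever quota (inside or outside) still has room.
    while len(inside) < n_inside or len(outside) < n_outside:
        attempts += 1
        if attempts > max_attempts:
            raise RuntimeError("could not fill quotas; check buffer and extent")
        p = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if in_buffer(p, presences, radius):
            if len(inside) < n_inside:
                inside.append(p)
        elif len(outside) < n_outside:
            outside.append(p)
    return inside, outside


# Example: 200 background points, 75% of them within 10 units of a presence.
inside, outside = sample_background(
    presences=[(50.0, 50.0)],
    extent=(0.0, 0.0, 100.0, 100.0),
    n_points=200,
    radius=10.0,
    weight=0.75,
)
```

Setting `weight` to 0 gives no points inside the buffer, and setting it to 1 constrains every background point to the buffer. Real analyses would normally work with projected geographic coordinates and a spatial library rather than plain tuples.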
Award ID(s):
1945013
NSF-PAR ID:
10488887
Author(s) / Creator(s):
; ;
Publisher / Repository:
Elsevier B.V.
Date Published:
Journal Name:
Ecological Modelling
Volume:
488
Issue:
C
ISSN:
0304-3800
Page Range / eLocation ID:
110604
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    The area under the curve (AUC) of the receiver operating characteristic (or certain modifications of it) is almost universally used to assess the performance of species distribution models (SDMs), despite the well‐recognized problems with this approach, which arise mainly when dealing with presence‐only data.

    We present a probabilistic treatment of the presence‐only problem and derive a method to assess the performance of SDMs based on the analysis of an area‐presence plot and the SDM outputs represented in both geographic and environmental spaces.

    We show how our method is useful to solve the two main tasks for which the AUC is used: assessing the performance of an SDM and comparing the performance of different SDMs. Our results build on previous work and constitute a rigorous method for assessing the performance of SDMs in relation to a random classifier.

    We establish comparisons with two of the most popular approaches used to assess the performance of an SDM, the AUC and the Boyce index, and identify cases in which our method has advantages over these two approaches.

    We suggest that the performance of an algorithm that classifies presence‐only data can be assessed by two factors: (a) the degree of non‐randomness of the classification at every step in the accumulation curve of presences, and (b) the amount of uninformative niche space used for the classification. The method we developed can be applied to any SDM output by using the R functions available at: https://github.com/LauraJim/SDM‐hyperTest.

     
  2. Abstract Aim

    Species distribution models (SDMs) that integrate presence‐only and presence–absence data offer a promising avenue to improve information on species' geographic distributions. The use of such ‘integrated SDMs’ on a species range‐wide extent has been constrained by the often limited presence–absence data and by the heterogeneous sampling of the presence‐only data. Here, we evaluate integrated SDMs for studying species ranges with a novel expert range map‐based evaluation. We build new understanding about how integrated SDMs address issues of estimation accuracy and data deficiency and thereby offer advantages over traditional SDMs.

    Location

    South and Central America.

    Time Period

    1979–2017.

    Major Taxa Studied

    Hummingbirds.

    Methods

    We build integrated SDMs by linking two observation models – one for each data type – to the same underlying spatial process. We validate SDMs with two schemes: (i) cross‐validation with presence–absence data and (ii) comparison with respect to the species' whole range as defined with IUCN range maps. We also compare models relative to the estimated response curves and compute the association between the benefit of the data integration and the number of presence records in each data set.

    Results

    The integrated SDM accounting for the spatially varying sampling intensity of the presence‐only data was one of the top performing models in both model validation schemes. Presence‐only data alleviated overly large niche estimates, and data integration was beneficial compared to modelling solely presence‐only data for species which had few presence points when predicting the species' whole range. On the community level, integrated models improved the species richness prediction.

    Main Conclusions

    Integrated SDMs combining presence‐only and presence–absence data can borrow strength from both data types and offer improved predictions of species' ranges. Integrated SDMs can potentially alleviate the impacts of taxonomically and geographically uneven sampling and leverage the detailed sampling information in presence–absence data.

     
  3. Summary

    Anthropogenic climate change has caused range shifts among many species. Species distribution models (SDMs) are used to predict how species ranges may change in the future. However, SDMs rarely consider how climate‐sensitive traits, such as phenology, which affect individuals' demography and fitness, may influence species' ranges.

    Using > 120 000 herbarium specimens representing 360 plant species distributed across the eastern United States, we developed a novel ‘phenology‐informed’ SDM that integrates phenological responses to changing climates. We compared the ranges of each species forecast by the phenology‐informed SDM with those from conventional SDMs. We further validated the modeling approach using hindcasting.

    When examining the range changes of all species, our phenology‐informed SDMs forecast less species loss and turnover under climate change than conventional SDMs. These results suggest that dynamic phenological responses of species may help them adjust their ecological niches and persist in their habitats as the climate changes.

    Plant phenology can modulate species' responses to climate change, mitigating its negative effects on species persistence. Further application of our framework will contribute to a generalized understanding of how traits affect species distributions along environmental gradients and facilitate the use of trait‐based SDMs across spatial and taxonomic scales.

     
  4. Species distribution and ecological niche models (hereafter SDMs) are popular tools with broad applications in ecology, biodiversity conservation, and environmental science. Many SDM applications require projecting models in environmental conditions non‐analog to those used for model training (extrapolation), giving predictions that may be statistically unsupported and biologically meaningless. We introduce a novel method, Shape, a model‐agnostic approach that calculates the extrapolation degree for a given projection data point by its multivariate distance to the nearest training data point. Such distances are relativized by a factor that reflects the dispersion of the training data in environmental space. Distinct from other approaches, Shape incorporates an adjustable threshold to control the binary discrimination between acceptable and unacceptable extrapolation degrees. We compared Shape's performance to five extrapolation metrics based on their ability to detect analog environmental conditions in environmental space and improve SDM suitability predictions. To do so, we used 760 virtual species to define different modeling conditions determined by species niche tolerance, distribution equilibrium condition, sample size, and algorithm. All algorithms had trouble predicting species niches. However, we found a substantial improvement in model predictions when model projections were truncated, independently of the extrapolation metric used. Shape's performance was dependent on the extrapolation threshold used to truncate models. Because of this versatility, our approach showed similar or better performance than the previous approaches and could better deal with all modeling conditions and algorithms. Our extrapolation metric is simple to interpret, captures the complex shapes of the data in environmental space, and can use any extrapolation threshold to define whether model predictions are retained based on the extrapolation degrees. These properties make this approach more broadly applicable than existing methods for creating and applying SDMs. We hope this method and accompanying tools help modelers explore, detect, and reduce extrapolation errors to achieve more reliable models.

    Keywords: environmental novelty, extrapolation, Mahalanobis distance, model prediction, non‐analog environmental data, transferability
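    The idea of a nearest-training-point distance scaled by the dispersion of the training data can be sketched as follows. This is only an illustration under stated assumptions, not the Shape implementation: it uses Euclidean distance on standardized variables where the keywords mention Mahalanobis distance, and the dispersion factor (mean distance of training points to their centroid) and the function name `extrapolation_degree` are hypothetical choices.

```python
import math
from statistics import mean, pstdev


def standardize(rows, means, sds):
    """Center and scale each environmental variable (column)."""
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def extrapolation_degree(train, projection):
    """Distance to the nearest training point, relative to training dispersion.

    Assumptions: Euclidean distance on standardized variables, and
    dispersion measured as the mean distance of training points to their
    centroid. Neither is confirmed as the published method's definition.
    """
    cols = list(zip(*train))
    means = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    z_train = standardize(train, means, sds)
    z_proj = standardize(projection, means, sds)
    # After standardization the training centroid sits at the origin.
    dispersion = mean(math.hypot(*row) for row in z_train)
    if dispersion == 0:
        raise ValueError("training data have no spread in environmental space")
    return [min(math.dist(p, t) for t in z_train) / dispersion for p in z_proj]


# Example: two environmental variables, four training points.
train = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
projection = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
degrees = extrapolation_degree(train, projection)

# Adjustable threshold: keep predictions only where extrapolation is acceptable.
threshold = 2.0  # hypothetical value
keep = [d <= threshold for d in degrees]
```

    A projection point that coincides with a training point gets a degree of zero, and the degree grows as the point moves away from all training data. The threshold then turns the continuous degree into the binary keep/discard decision the abstract describes.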

     
  5. Dainton, John (Ed.)

    Improving models of species' distributions is essential for conservation, especially in light of global change. Species distribution models (SDMs) often rely on mean environmental conditions, yet species distributions are also a function of environmental heterogeneity and filtering acting at multiple spatial scales. Geodiversity, which we define as the variation of abiotic features and processes of Earth's entire geosphere (inclusive of climate), has potential to improve SDMs and conservation assessments because geodiversity variables capture multiple abiotic dimensions of species niches; however, they have not been sufficiently tested in SDMs. We tested a range of geodiversity variables computed at varying scales using climate and elevation data. We compared the predictive performance of MaxEnt SDMs generated using CHELSA bioclimatic variables to those also including geodiversity variables for 31 mammalian species in Colombia. Results show that the spatial grain of geodiversity variables affects SDM performance. Some variables consistently exhibited an increasing or decreasing trend in variable importance with spatial grain, showing slight scale-dependence and indicating that some geodiversity variables are more relevant at particular scales for some species. Incorporating geodiversity variables into SDMs, and doing so at the appropriate spatial scales, enhances the ability to model species-environment relationships, thereby contributing to the conservation and management of biodiversity.

    This article is part of the Theo Murphy meeting issue ‘Geodiversity for science and society’.

     