Title: Integrating distance sampling and presence‐only data to estimate species abundance
Integrated models combine multiple data types within a unified analysis to estimate species abundance and covariate effects. By sharing biological parameters, integrated models improve the accuracy and precision of estimates compared to separate analyses of individual data sets. We developed an integrated point process model to combine presence-only and distance sampling data for estimation of spatially explicit abundance patterns. Simulations across a range of parameter values demonstrate that our model can recover estimates of biological covariates, but parameter accuracy and precision varied with the quantity of each data type. We applied our model to a case study of black-backed jackals in the Masai Mara National Reserve, Kenya, to examine effects of spatially varying covariates on jackal abundance patterns. The model revealed that jackals were positively affected by anthropogenic disturbance on the landscape, with highest abundance estimated along the Reserve border near human activity. We found minimal effects of landscape cover, lion density, and distance to water source, suggesting that human use of the Reserve may be the biggest driver of jackal abundance patterns. Our integrated model expands the scope of ecological inference by taking advantage of widely available presence-only data, while simultaneously leveraging richer, but typically limited, distance sampling data.
Award ID(s):
1755089 1954406 1702635
NSF-PAR ID:
10206413
Journal Name:
Ecology
ISSN:
0012-9658
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Monitoring wildlife abundance across space and time is essential for studying population dynamics and informing effective management. Acoustic recording units are a promising technology for efficiently monitoring bird populations and communities. While current acoustic data models provide information on the presence/absence of individual species, new approaches are needed to monitor population abundance, ideally across large spatio‐temporal regions.

    We present an integrated modelling framework that combines high‐quality but temporally sparse bird point count survey data with acoustic recordings. Our models account for imperfect detection in both data types and false positive errors in the acoustic data. Using simulations, we compare the accuracy and precision of abundance estimates using differing amounts of acoustic vocalizations obtained from a clustering algorithm, point count data, and a subset of manually validated acoustic vocalizations. We also use our modelling framework in a case study to estimate abundance of the Eastern Wood‐Pewee (Contopus virens) in Vermont, USA.

    The simulation study reveals that combining acoustic and point count data via an integrated model improves accuracy and precision of abundance estimates compared with models informed by either acoustic or point count data alone. Improved estimates are obtained across a wide range of scenarios, with the largest gains occurring when detection probability for the point count data is low. Combining acoustic data with only a small number of point count surveys yields estimates of abundance without the need for validating any of the identified vocalizations from the acoustic data. Within our case study, the integrated models provided moderate support for a decline of the Eastern Wood‐Pewee in this region.

    Our integrated modelling approach combines dense acoustic data with few point count surveys to deliver reliable estimates of species abundance without the need for manual identification of acoustic vocalizations or a prohibitively expensive large number of repeated point count surveys. Our proposed approach offers an efficient monitoring alternative for large spatio‐temporal regions when point count data are difficult to obtain or when monitoring is focused on rare species with low detection probability.

     
  2. Abstract

    Riverscape genetics, which applies concepts in landscape genetics to riverine ecosystems, lacks appropriate quantitative methods that address the spatial autocorrelation structure of linear stream networks and account for bidirectional geneflow. To address these challenges, we present a general framework for the design and analysis of riverscape genetic studies. Our framework starts with the estimation of pairwise genetic distance at sample sites and the development of a spatially structured ecological network (SSEN) on which riverscape covariates are measured. We then introduce the novel bidirectional geneflow in riverscapes (BGR) model that uses principles of isolation‐by‐resistance to quantify the effects of environmental covariates on genetic connectivity, with spatial covariance defined using simultaneous autoregressive models on the SSEN and the generalized Wishart distribution to model pairwise distance matrices arising through a random walk model of geneflow. We highlight the utility of this framework in an analysis of riverscape genetics for brook trout (Salvelinus fontinalis) in north central Pennsylvania, USA. Using the fixation index (FST) as the measure of genetic distance, we estimated the effects of 12 riverscape covariates on geneflow by evaluating the relative support of eight competing BGR models. We then compared the performance of the top‐ranked BGR model to results obtained from comparable analyses using multiple regression on distance matrices (MRM) and the program STRUCTURE. We found that the BGR model had more power to detect covariate effects, particularly for variables that were only partial barriers to geneflow and/or uncommon in the riverscape, making it more informative for assessing patterns of population connectivity and identifying threats to species conservation.
This case study highlights the utility of our modeling framework over other quantitative methods in riverscape genetics, particularly the ability to rigorously test hypotheses about factors that influence geneflow and probabilistically estimate the effect of riverscape covariates, including stream flow direction. This framework is flexible across taxa and riverine networks, is easily executable, and provides intuitive results that can be used to investigate the likely outcomes of current and future management scenarios.

     
  3. Abstract

    Measuring forest biodiversity using terrestrial surveys is expensive and can only capture common species abundance in large heterogeneous landscapes. In contrast, combining airborne imagery with computer vision can generate individual tree data at the scale of hundreds of thousands of trees. To train computer vision models, ground‐based species labels are combined with airborne reflectance data. Due to the difficulty of finding rare species in a large landscape, many classification models only include the most abundant species, leading to biased predictions at broad scales. For example, if only common species are used to train the model, this assumes that these samples are representative across the entire landscape. Extending classification models to include rare species requires targeted data collection and algorithmic improvements to overcome large data imbalances between dominant and rare taxa. We apply a targeted sampling workflow to the Ordway Swisher Biological Station within the US National Ecological Observatory Network (NEON), where traditional forestry plots had identified six canopy tree species with more than 10 individuals at the site. Combining iterative model development with rare species sampling, we extend a training dataset to include 14 species. Using a multi‐temporal hierarchical model, we demonstrate the ability to include species predicted at <1% frequency in the landscape without losing performance on the dominant species. The final model has over 75% accuracy for 14 species with improved rare species classification compared to 61% accuracy of a baseline deep learning model. After filtering out dead trees, we generate landscape species maps of individual crowns for over 670 000 individual trees. We find distinct patches of forest composed of rarer species at the full‐site scale, highlighting the importance of capturing species diversity in training data.
We estimate the relative abundance of 14 species within the landscape and provide three measures of uncertainty to generate a range of counts for each species. For example, we estimate that the dominant species, Pinus palustris, accounts for c. 28% of predicted stems, with models predicting a range of counts between 160 000 and 210 000 individuals. These maps provide the first estimates of canopy tree diversity within a NEON site to include rare species and provide a blueprint for capturing tree diversity using airborne computer vision at broad scales.

     
  4. Abstract

    Aim

    Species distribution models (SDMs) that integrate presence‐only and presence–absence data offer a promising avenue to improve information on species' geographic distributions. The use of such ‘integrated SDMs’ on a species range‐wide extent has been constrained by the often limited presence–absence data and by the heterogeneous sampling of the presence‐only data. Here, we evaluate integrated SDMs for studying species ranges with a novel expert range map‐based evaluation. We build new understanding about how integrated SDMs address issues of estimation accuracy and data deficiency and thereby offer advantages over traditional SDMs.

    Location

    South and Central America.

    Time Period

    1979–2017.

    Major Taxa Studied

    Hummingbirds.

    Methods

    We build integrated SDMs by linking two observation models – one for each data type – to the same underlying spatial process. We validate SDMs with two schemes: (i) cross‐validation with presence–absence data and (ii) comparison with respect to the species' whole range as defined with IUCN range maps. We also compare models relative to the estimated response curves and compute the association between the benefit of the data integration and the number of presence records in each data set.

    Results

    The integrated SDM accounting for the spatially varying sampling intensity of the presence‐only data was one of the top performing models in both model validation schemes. Presence‐only data alleviated overly large niche estimates, and data integration was beneficial compared to modelling solely presence‐only data for species which had few presence points when predicting the species' whole range. On the community level, integrated models improved the species richness prediction.

    Main Conclusions

    Integrated SDMs combining presence‐only and presence–absence data are successfully able to borrow strengths from both data types and offer improved predictions of species' ranges. Integrated SDMs can potentially alleviate the impacts of taxonomically and geographically uneven sampling and leverage the detailed sampling information in presence–absence data.

     
  5. Abstract

    Ecological distance‐based spatial capture–recapture models (SCR) are a promising approach for simultaneously estimating animal density and connectivity, both of which affect spatial population processes and ultimately species persistence. We explored how SCR models can be integrated into reserve‐design frameworks that explicitly acknowledge both the spatial distribution of individuals and their space use resulting from landscape structure. We formulated the design of wildlife reserves as a budget‐constrained optimization problem and conducted a simulation to explore 3 different SCR‐informed optimization objectives that prioritized different conservation goals by maximizing the number of protected individuals, reserve connectivity, and density‐weighted connectivity. We also studied the effect on our 3 objectives of enforcing that the space‐use requirements of individuals be met by the reserve for individuals to be considered conserved (referred to as home‐range constraints). Maximizing local population density resulted in fragmented reserves that would likely not aid long‐term population persistence, and maximizing the connectivity objective yielded reserves that protected the fewest individuals. However, maximizing density‐weighted connectivity or preemptively imposing home‐range constraints on reserve design yielded reserves of largely spatially compact sets of parcels covering high‐density areas in the landscape with high functional connectivity between them. Our results quantify the extent to which reserve design is constrained by individual home‐range requirements and highlight that accounting for individual space use in the objective and constraints can help in the design of reserves that balance abundance and connectivity in a biologically relevant manner.

     