

Title: movedesign: Shiny R app to evaluate sampling design for animal movement studies
Abstract

Projects focused on movement behaviour and home range are commonplace, but beyond a focus on choosing appropriate research questions, there are no clear guidelines for such studies. Without these guidelines, designing an animal tracking study to produce reliable estimates of space‐use and movement properties (necessary to answer basic movement ecology questions), is often done in an ad hoc manner.

We developed ‘movedesign’, a user‐friendly Shiny application, which can be utilized to investigate the precision of three estimates regularly reported in movement and spatial ecology studies: home range area, speed and distance travelled. Conceptually similar to statistical power analysis, this application enables users to assess the degree of estimate precision that may be achieved with a given sampling design; that is, the choices regarding data resolution (sampling interval) and battery life (sampling duration).

Leveraging the ‘ctmm’ R package, we utilize two methods proven to handle many common biases in animal movement datasets: autocorrelated kernel density estimators (AKDEs) and continuous‐time speed and distance (CTSD) estimators. Longer sampling durations are required to reliably estimate home range areas via the detection of a sufficient number of home range crossings. In contrast, speed and distance estimation requires a sampling interval short enough to ensure that a statistically significant signature of the animal's velocity remains in the data.

This application addresses key challenges faced by researchers when designing tracking studies, including the trade‐off between long battery life and high resolution of GPS locations collected by the devices, which may result in a compromise between reliably estimating home range or speed and distance. ‘movedesign’ has broad applications for researchers and decision‐makers, supporting them to focus efforts and resources in achieving the optimal sampling design strategy for their research questions, prioritizing the correct deployment decisions for insightful and reliable outputs, while understanding the trade‐off associated with these choices.
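The contrast between sampling duration and sampling interval described in this abstract can be illustrated with a minimal sketch. It is not the method implemented in ‘movedesign’ or ‘ctmm’. It assumes two hypothetical inputs that do not appear in the abstract: `tau_p_days`, a position autocorrelation timescale taken here as the average time to cross the home range, and `tau_v_hours`, a velocity autocorrelation timescale. Under those assumptions, the ratio of duration to `tau_p_days` roughly counts the home range crossings observed, and the ratio of interval to `tau_v_hours` indicates how coarse the sampling is relative to the persistence of the animal's velocity.

```python
from dataclasses import dataclass


@dataclass
class DesignCheck:
    n_crossings: float     # approximate number of home range crossings observed
    interval_ratio: float  # sampling interval divided by velocity timescale


def check_design(duration_days, interval_hours, tau_p_days, tau_v_hours):
    """Heuristic summary of a sampling design (illustrative assumption only).

    tau_p_days and tau_v_hours are hypothetical movement parameters that
    would in practice be estimated from data or taken from prior studies.
    """
    if min(duration_days, interval_hours, tau_p_days, tau_v_hours) <= 0:
        raise ValueError("all inputs must be positive")
    return DesignCheck(
        n_crossings=duration_days / tau_p_days,
        interval_ratio=interval_hours / tau_v_hours,
    )


# A long deployment with one fix per day: many crossings, coarse interval.
long_coarse = check_design(duration_days=365, interval_hours=24,
                           tau_p_days=5, tau_v_hours=1)

# A short deployment with a fix every 15 minutes: few crossings, fine interval.
short_fine = check_design(duration_days=10, interval_hours=0.25,
                          tau_p_days=5, tau_v_hours=1)

print(long_coarse)
print(short_fine)
```

In this sketch the first design favours home range estimation (many crossings) while the second favours speed and distance estimation (interval shorter than the velocity timescale), which mirrors the battery life versus resolution trade‐off the abstract describes.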

 
Award ID(s):
1915347
NSF-PAR ID:
10420311
Author(s) / Creator(s):
 ;  ;  ;  ;  
Publisher / Repository:
Wiley-Blackwell
Date Published:
Journal Name:
Methods in Ecology and Evolution
ISSN:
2041-210X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Technological advances have steadily increased the detail of animal tracking datasets, yet fundamental data limitations exist for many species that cause substantial biases in home‐range estimation. Specifically, the effective sample size of a range estimate is proportional to the number of observed range crossings, not the number of sampled locations. Currently, the most accurate home‐range estimators condition on an autocorrelation model, for which the standard estimation frameworks are based on likelihood functions, even though these methods are known to underestimate variance, and therefore ranging area, when effective sample sizes are small.

    Residual maximum likelihood (REML) is a widely used method for reducing bias in maximum‐likelihood (ML) variance estimation at small sample sizes. Unfortunately, we find that REML is too unstable for practical application to continuous‐time movement models. When the effective sample size N is decreased to N ≤ O(10), which is common in tracking applications, REML undergoes a sudden divergence in variance estimation. To avoid this issue, while retaining REML’s first‐order bias correction, we derive a family of estimators that leverage REML to make a perturbative correction to ML. We also derive AIC values for REML and our estimators, including cases where model structures differ, which is not generally understood to be possible.

    Using both simulated data and GPS data from lowland tapir (Tapirus terrestris), we show how our perturbative estimators are more accurate than traditional ML and REML methods. Specifically, when O(5) home‐range crossings are observed, REML is unreliable by orders of magnitude, ML home ranges are ~30% underestimated, and our perturbative estimators yield home ranges that are only ~10% underestimated. A parametric bootstrap can then reduce the ML and perturbative home‐range underestimation to ~10% and ~3%, respectively.

    Home‐range estimation is one of the primary reasons for collecting animal tracking data, and small effective sample sizes are a more common problem than is currently realized. The methods introduced here allow for more accurate movement‐model and home‐range estimation at small effective sample sizes, and thus fill an important role for animal movement analysis. Given REML’s widespread use, our methods may also be useful in other contexts where effective sample sizes are small.

     
  2. Abstract

    Network analysis of infectious disease in wildlife can reveal traits or individuals critical to pathogen transmission and help inform disease management strategies. However, estimates of contact between animals are notoriously difficult to acquire. Researchers commonly use telemetry technologies to identify animal associations, but such data may have different sampling intervals and often captures a small subset of the population. The objectives of this study were to outline best practices for telemetry sampling in network studies of infectious disease by determining (a) the consequences of telemetry sampling on our ability to estimate network structure, (b) whether contact networks can be approximated using purely spatial contact definitions and (c) how wildlife spatial configurations may influence telemetry sampling requirements.

    We simulated individual movement trajectories for wildlife populations using a home range‐like movement model, creating full location datasets and corresponding ‘complete’ networks. To mimic telemetry data, we created ‘sample’ networks by subsampling the population (10%–100% of individuals) with a range of sampling intervals (every minute to every 3 days). We varied the definition of contact for sample networks, using either spatiotemporal or spatial overlap, and varied the spatial configuration of populations (random, lattice or clustered). To compare complete and sample networks, we calculated seven network metrics important for disease transmission and assessed mean ranked correlation coefficients and percent error between complete and sample network metrics.

    Telemetry sampling severely reduced our ability to calculate global node‐level network metrics, but had less impact on local and network‐level metrics. Even so, in populations with infrequent associations, high intensity telemetry sampling may still be necessary. Defining contact in terms of spatial overlap generally resulted in overly connected networks, but in some instances, could compensate for otherwise coarse telemetry data.

    By synthesizing movement and disease ecology with computational approaches, we characterized trade‐offs important for using wildlife telemetry data beyond ecological studies of individual movement, and found that careful use of telemetry data has the potential to inform network models. Thus, with informed application of telemetry data, we can make significant advances in leveraging its use for a better understanding and management of wildlife infectious disease.

     
  3. Abstract

    Home range estimation is routine practice in ecological research. While advances in animal tracking technology have increased our capacity to collect data to support home range analysis, these same advances have also resulted in increasingly autocorrelated data. Consequently, the question of which home range estimator to use on modern, highly autocorrelated tracking data remains open. This question is particularly relevant given that most estimators assume independently sampled data. Here, we provide a comprehensive evaluation of the effects of autocorrelation on home range estimation. We base our study on an extensive data set of GPS locations from 369 individuals representing 27 species distributed across five continents. We first assemble a broad array of home range estimators, including Kernel Density Estimation (KDE) with four bandwidth optimizers (Gaussian reference function, autocorrelated‐Gaussian reference function [AKDE], Silverman's rule of thumb, and least squares cross‐validation), Minimum Convex Polygon, and Local Convex Hull methods. Notably, all of these estimators except AKDE assume independent and identically distributed (IID) data. We then employ half‐sample cross‐validation to objectively quantify estimator performance, and the recently introduced effective sample size for home range area estimation to quantify the information content of each data set. We found that AKDE 95% area estimates were larger than conventional IID‐based estimates by a mean factor of 2. The median number of cross‐validated locations included in the hold‐out sets by AKDE 95% (or 50%) estimates was 95.3% (or 50.1%), confirming the larger AKDE ranges were appropriately selective at the specified quantile. Conversely, conventional estimates exhibited negative bias that increased with decreasing effective sample size.

    To contextualize our empirical results, we performed a detailed simulation study to tease apart how sampling frequency, sampling duration, and the focal animal's movement conspire to affect range estimates. Paralleling our empirical results, the simulation study demonstrated that AKDE was generally more accurate than conventional methods, particularly for small effective sample sizes. While 72% of the 369 empirical data sets had >1,000 total observations, only 4% had an effective sample size >1,000, and 30% had an effective sample size <30. In this frequently encountered scenario of small effective sample size, AKDE was the only estimator capable of producing an accurate home range estimate on autocorrelated data.

     
  4. Abstract

    Ecologists have long been interested in linking individual behaviour with higher level processes. For motile species, this ‘upscaling’ is governed by how well any given movement strategy maximizes encounters with positive factors and minimizes encounters with negative factors. Despite the importance of encounter events for a broad range of ecological processes, encounter theory has not kept pace with developments in animal tracking or movement modelling. Furthermore, existing work has focused primarily on the relationship between animal movement and encounter rates while the relationship between individual movement and the spatial locations of encounter events in the environment has remained conspicuously understudied.

    Here, we bridge this gap by introducing a method for describing the long‐term encounter location probabilities for movement within home ranges, termed the conditional distribution of encounters (CDE). We then derive this distribution, as well as confidence intervals, implement its statistical estimator into open‐source software and demonstrate the broad ecological relevance of this distribution.

    We first use simulated data to show how our estimator provides asymptotically consistent estimates. We then demonstrate the general utility of this method for three simulation‐based scenarios that occur routinely in biological systems: (a) a population of individuals with home ranges that overlap with neighbours; (b) a pair of individuals with a hard territorial border between their home ranges; and (c) a predator with a large home range that encompassed the home ranges of multiple prey individuals. Using GPS data from white‐faced capuchins Cebus capucinus, tracked on Barro Colorado Island, Panama, and sleepy lizards Tiliqua rugosa, tracked in Bundey, South Australia, we then show how the CDE can be used to estimate the locations of territorial borders, identify key resources, quantify the potential for competitive or predatory interactions and/or identify any changes in behaviour that directly result from location‐specific encounter probability.

    The CDE enables researchers to better understand the dynamics of populations of interacting individuals. Notably, the general estimation framework developed in this work builds straightforwardly off of home range estimation and requires no specialized data collection protocols. This method is now openly available via the ctmm R package.

     
  5. Abstract

    Understanding animal movement often relies upon telemetry and biologging devices. These data are frequently used to estimate latent behavioural states to help understand why animals move across the landscape. While there are a variety of methods that make behavioural inferences from biotelemetry data, some features of these methods (e.g. analysis of a single data stream, use of parametric distributions) may limit their generality to reliably discriminate among behavioural states.

    To address some of the limitations of existing behavioural state estimation models, we introduce a nonparametric Bayesian framework called the mixed‐membership method for movement (M4), which is available within the open‐source bayesmove R package. This framework can analyse multiple data streams (e.g. step length, turning angle, acceleration) without relying on parametric distributions, which may capture complex behaviours more successfully than current methods. We tested our Bayesian framework using simulated trajectories and compared model performance against two segmentation methods (behavioural change point analysis (BCPA) and segclust2d), one machine learning method [expectation‐maximization binary clustering (EMbC)] and one type of state‐space model [hidden Markov model (HMM)]. We also illustrated this Bayesian framework using movements of juvenile snail kites Rostrhamus sociabilis in Florida, USA.

    The Bayesian framework estimated breakpoints more accurately than the other segmentation methods for tracks of different lengths. Likewise, the Bayesian framework provided more accurate estimates of behaviour than the other state estimation methods when simulations were generated from less frequently considered distributions (e.g. truncated normal, beta, uniform). Three behavioural states were estimated from snail kite movements, which were labelled as ‘encamped’, ‘area‐restricted search’ and ‘transit’. Changes in these behaviours over time were associated with known dispersal events from the nest site, as well as movements to and from possible breeding locations.

    Our nonparametric Bayesian framework estimated behavioural states with comparable or superior accuracy compared to the other methods when step lengths and turning angles of simulations were generated from less frequently considered distributions. Since the most appropriate parametric distributions may not be obvious a priori, methods (such as M4) that are agnostic to the underlying distributions can provide powerful alternatives to address questions in movement ecology.

     