

Title: Overcoming the challenge of small effective sample sizes in home‐range estimation
Abstract

Technological advances have steadily increased the detail of animal tracking datasets, yet fundamental data limitations exist for many species that cause substantial biases in home‐range estimation. Specifically, the effective sample size of a range estimate is proportional to the number of observed range crossings, not the number of sampled locations. Currently, the most accurate home‐range estimators condition on an autocorrelation model, for which the standard estimation frameworks are based on likelihood functions, even though these methods are known to underestimate variance, and therefore ranging area, when effective sample sizes are small.

Residual maximum likelihood (REML) is a widely used method for reducing bias in maximum‐likelihood (ML) variance estimation at small sample sizes. Unfortunately, we find that REML is too unstable for practical application to continuous‐time movement models. When the effective sample size N is decreased to N ≤ O(10), which is common in tracking applications, REML undergoes a sudden divergence in variance estimation. To avoid this issue, while retaining REML’s first‐order bias correction, we derive a family of estimators that leverage REML to make a perturbative correction to ML. We also derive AIC values for REML and our estimators, including cases where model structures differ, which is not generally understood to be possible.
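
The small‐sample bias that REML targets can be illustrated in the simplest possible setting. The sketch below assumes independent Gaussian draws with a fitted mean, which is a deliberate simplification and not the continuous‐time movement models treated in the paper. In that IID case, ML divides the sum of squares by n while REML divides by n − 1. The function name and the simulation settings are illustrative choices.

```python
import random

def mean_variance_estimates(n, trials=20000, seed=0):
    """Average the ML and REML variance estimates over many IID Gaussian samples."""
    rng = random.Random(seed)
    ml_total = 0.0
    reml_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)
        ml_total += ss / n          # ML: divides by n, biased low
        reml_total += ss / (n - 1)  # REML (IID case): divides by n - 1
    return ml_total / trials, reml_total / trials

ml_mean, reml_mean = mean_variance_estimates(n=5)
print(f"mean ML estimate:   {ml_mean:.3f}")
print(f"mean REML estimate: {reml_mean:.3f}")
```

With n = 5, the ML average should sit near (n − 1)/n = 0.8 of the true variance, while the REML average should sit near 1.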

Using both simulated data and GPS data from lowland tapir (Tapirus terrestris), we show how our perturbative estimators are more accurate than traditional ML and REML methods. Specifically, when O(5) home‐range crossings are observed, REML is unreliable by orders of magnitude, ML home ranges are ~30% underestimated, and our perturbative estimators yield home ranges that are only ~10% underestimated. A parametric bootstrap can then reduce the ML and perturbative home‐range underestimation to ~10% and ~3%, respectively.
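
A parametric bootstrap bias correction can be sketched in a simplified IID Gaussian setting (an assumption made here for illustration; the paper applies the idea to fitted movement models). The procedure simulates new datasets from the fitted model, re‐estimates the variance on each, and rescales the original estimate by the ratio of the fitted value to the average bootstrap estimate. The multiplicative form of the correction, the example data, and all names below are illustrative choices, not the authors' implementation.

```python
import random

def ml_variance(sample):
    """ML variance estimate (divides by n), which is biased low for small n."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n

def bootstrap_corrected_variance(sample, n_boot=4000, seed=1):
    """Rescale the ML variance using a parametric bootstrap estimate of its bias."""
    rng = random.Random(seed)
    n = len(sample)
    fitted_mean = sum(sample) / n
    fitted_var = ml_variance(sample)
    fitted_sd = fitted_var ** 0.5
    boot_total = 0.0
    for _ in range(n_boot):
        # Simulate a dataset of the same size from the fitted model and refit.
        simulated = [rng.gauss(fitted_mean, fitted_sd) for _ in range(n)]
        boot_total += ml_variance(simulated)
    boot_mean = boot_total / n_boot
    # If refits average a fraction f of the fitted value, divide the estimate by f.
    return fitted_var * (fitted_var / boot_mean)

data = [2.1, -0.4, 1.3, 0.2, -1.7]  # hypothetical observations
ml_estimate = ml_variance(data)
corrected = bootstrap_corrected_variance(data)
print(f"ML estimate: {ml_estimate:.3f}, bootstrap-corrected: {corrected:.3f}")
```

In this toy case the corrected estimate should exceed the ML estimate by a factor close to n/(n − 1), which is the known bias of the ML variance for IID Gaussian data.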

Home‐range estimation is one of the primary reasons for collecting animal tracking data, and small effective sample sizes are a more common problem than is currently realized. The methods introduced here allow for more accurate movement‐model and home‐range estimation at small effective sample sizes, and thus fill an important role for animal movement analysis. Given REML’s widespread use, our methods may also be useful in other contexts where effective sample sizes are small.

 
NSF-PAR ID:
10460744
Publisher / Repository:
Wiley-Blackwell
Journal Name:
Methods in Ecology and Evolution
Volume:
10
Issue:
10
ISSN:
2041-210X
Page Range / eLocation ID:
p. 1679-1689
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Projects focused on movement behaviour and home range are commonplace, but beyond a focus on choosing appropriate research questions, there are no clear guidelines for such studies. Without these guidelines, designing an animal tracking study to produce reliable estimates of space‐use and movement properties (necessary to answer basic movement ecology questions) is often done in an ad hoc manner.

    We developed ‘movedesign’, a user‐friendly Shiny application, which can be utilized to investigate the precision of three estimates regularly reported in movement and spatial ecology studies: home range area, speed and distance travelled. Conceptually similar to statistical power analysis, this application enables users to assess the degree of estimate precision that may be achieved with a given sampling design; that is, the choices regarding data resolution (sampling interval) and battery life (sampling duration).

    Leveraging the ‘ctmm’ R package, we utilize two methods proven to handle many common biases in animal movement datasets: autocorrelated kernel density estimators (AKDEs) and continuous‐time speed and distance (CTSD) estimators. Longer sampling durations are required to reliably estimate home range areas via the detection of a sufficient number of home range crossings. In contrast, speed and distance estimation requires a sampling interval short enough to ensure that a statistically significant signature of the animal's velocity remains in the data.

    This application addresses key challenges faced by researchers when designing tracking studies, including the trade‐off between long battery life and high resolution of GPS locations collected by the devices, which may result in a compromise between reliably estimating home range or speed and distance. ‘movedesign’ has broad applications for researchers and decision‐makers, supporting them to focus efforts and resources in achieving the optimal sampling design strategy for their research questions, prioritizing the correct deployment decisions for insightful and reliable outputs, while understanding the trade‐off associated with these choices.

     
  2. Key points

    The beneficial effects of sustained or lifelong (>25 years) endurance exercise on cardiovascular structure and exercise function have been largely established in men.

    The current findings indicate that committed (≥4 weekly exercise sessions) lifelong exercise results in substantial benefits in exercise capacity, cardiovascular function at submaximal and maximal exercise, left ventricular mass and compliance, and blood volume compared to similarly aged or even younger (middle‐age) untrained women.

    Endurance exercise training should be considered a key strategy to prevent cardiovascular disease with ageing in women as well as men.

    Abstract

    This study was a retrospective, cross‐sectional analysis of exercise performance and left ventricular (LV) morphology in 70 women to examine whether women who have performed regular, lifelong endurance exercise acquire the same beneficial adaptations in cardiovascular structure and function and exercise performance that have been reported previously in men. Three groups of women were examined: (1) 35 older (>60 years) untrained women (older untrained, OU), (2) 13 older women who had consistently performed four or more endurance exercise sessions weekly for at least 25 years (older trained, OT), and (3) 22 middle‐aged (range 35–59 years) untrained women (middle‐aged untrained, MU) as a reference control for the appropriate age‐related changes. Oxygen uptake and cardiovascular function (cardiac output; stroke volume (SV); acetylene rebreathing) were examined at rest, steady‐state submaximal exercise and maximal exercise (maximal oxygen uptake). Blood volume (CO rebreathing) and LV mass (cardiac magnetic resonance imaging), plus invasive measures of static and dynamic chamber compliance, were also examined. Maximal oxygen uptake (p < 0.001) and maximal exercise cardiac output and SV were larger in older trained women compared to the two untrained groups (∼17% and ∼27% for cardiac output and SV, respectively, versus MU; ∼40% and ∼38% versus OU, all p < 0.001). Blood volume (mL kg−1) and LV mass index (g m−2) were larger in OT versus OU (∼11% and ∼16%, respectively, both P ≤ 0.015). Static LV chamber compliance was greater in OT compared to both untrained groups (median (25–75%): MU: 0.065 (0.049–0.080); OU: 0.085 (0.061–0.138); OT: 0.047 (0.031–0.054), P ≤ 0.053). Collectively, these findings indicate that lifetime endurance exercise appears to be extremely effective at preserving or even enhancing cardiovascular structure and function with advanced age in women.

     
  3. Abstract

    The genetic effective population size, Ne, can be estimated from the average gametic disequilibrium between pairs of loci, but such estimates require evaluation of assumptions and currently have few methods to estimate confidence intervals. speed‐ne is a suite of MATLAB computer code functions to estimate Ne from gametic disequilibrium with a graphical user interface and a rich set of outputs that aid in understanding data patterns and comparing multiple estimators. speed‐ne includes functions to either generate or input simulated genotype data to facilitate comparative studies of Ne estimators under various population genetic scenarios. speed‐ne was validated with data simulated under both time‐forward and time‐backward coalescent models of genetic drift. Three classes of estimators were compared with simulated data to examine several general questions: what are the impacts of microsatellite null alleles on Ne estimates, how should missing data be treated, and does disequilibrium contributed by reduced recombination among some loci in a sample impact Ne estimates. Estimators differed greatly in precision in the scenarios examined, and a widely employed Ne estimator exhibited the largest variances among replicate data sets. speed‐ne implements several jackknife approaches to estimate confidence intervals, and simulated data showed that jackknifing over loci and jackknifing over individuals provided ~95% confidence interval coverage for some estimators and should be useful for empirical studies. speed‐ne provides an open‐source extensible tool for estimation of Ne from empirical genotype data and to conduct simulations of both microsatellite and single nucleotide polymorphism (SNP) data types to develop expectations and to compare Ne estimators.

     
  4. Abstract

    Home range estimation is routine practice in ecological research. While advances in animal tracking technology have increased our capacity to collect data to support home range analysis, these same advances have also resulted in increasingly autocorrelated data. Consequently, the question of which home range estimator to use on modern, highly autocorrelated tracking data remains open. This question is particularly relevant given that most estimators assume independently sampled data. Here, we provide a comprehensive evaluation of the effects of autocorrelation on home range estimation. We base our study on an extensive data set of GPS locations from 369 individuals representing 27 species distributed across five continents. We first assemble a broad array of home range estimators, including Kernel Density Estimation (KDE) with four bandwidth optimizers (Gaussian reference function, autocorrelated‐Gaussian reference function [AKDE], Silverman's rule of thumb, and least squares cross‐validation), Minimum Convex Polygon, and Local Convex Hull methods. Notably, all of these estimators except AKDE assume independent and identically distributed (IID) data. We then employ half‐sample cross‐validation to objectively quantify estimator performance, and the recently introduced effective sample size for home range area estimation to quantify the information content of each data set. We found that AKDE 95% area estimates were larger than conventional IID‐based estimates by a mean factor of 2. The median number of cross‐validated locations included in the hold‐out sets by AKDE 95% (or 50%) estimates was 95.3% (or 50.1%), confirming the larger AKDE ranges were appropriately selective at the specified quantile. Conversely, conventional estimates exhibited negative bias that increased with decreasing effective sample size.

    To contextualize our empirical results, we performed a detailed simulation study to tease apart how sampling frequency, sampling duration, and the focal animal's movement conspire to affect range estimates. Paralleling our empirical results, the simulation study demonstrated that AKDE was generally more accurate than conventional methods, particularly for small effective sample sizes. While 72% of the 369 empirical data sets had >1,000 total observations, only 4% had an effective sample size >1,000, whereas 30% had an effective sample size <30. In this frequently encountered scenario of small effective sample size, AKDE was the only estimator capable of producing an accurate home range estimate on autocorrelated data.

     
  5. Abstract

    Ecologists have long been interested in linking individual behaviour with higher level processes. For motile species, this ‘upscaling’ is governed by how well any given movement strategy maximizes encounters with positive factors and minimizes encounters with negative factors. Despite the importance of encounter events for a broad range of ecological processes, encounter theory has not kept pace with developments in animal tracking or movement modelling. Furthermore, existing work has focused primarily on the relationship between animal movement and encounter rates, while the relationship between individual movement and the spatial locations of encounter events in the environment has remained conspicuously understudied.

    Here, we bridge this gap by introducing a method for describing the long‐term encounter location probabilities for movement within home ranges, termed the conditional distribution of encounters (CDE). We then derive this distribution, as well as confidence intervals, implement its statistical estimator into open‐source software and demonstrate the broad ecological relevance of this distribution.

    We first use simulated data to show how our estimator provides asymptotically consistent estimates. We then demonstrate the general utility of this method for three simulation‐based scenarios that occur routinely in biological systems: (a) a population of individuals with home ranges that overlap with neighbours; (b) a pair of individuals with a hard territorial border between their home ranges; and (c) a predator with a large home range that encompassed the home ranges of multiple prey individuals. Using GPS data from white‐faced capuchins Cebus capucinus, tracked on Barro Colorado Island, Panama, and sleepy lizards Tiliqua rugosa, tracked in Bundey, South Australia, we then show how the CDE can be used to estimate the locations of territorial borders, identify key resources, quantify the potential for competitive or predatory interactions and/or identify any changes in behaviour that directly result from location‐specific encounter probability.

    The CDE enables researchers to better understand the dynamics of populations of interacting individuals. Notably, the general estimation framework developed in this work builds straightforwardly off of home range estimation and requires no specialized data collection protocols. This method is now openly available via the ctmm R package.

     