Title: Automated Earthwork Detection Using Topological Persistence
Abstract

For thousands of years, humans have altered the movement of water through construction of earthworks. These earthworks remain in landscapes, where they continue to alter hydrology, even where structures have long since been abandoned. Management of lands containing earthworks requires an understanding of how the earthworks impact hydrology and knowledge of where the structures are located in the landscape. Various methods for detecting topographic features exist in the literature, including a set of rule and threshold‐based techniques and machine learning methods. These tools are either labor‐intensive or require special pre‐processing or a priori assumptions about structures that limit generalizability. Here, we test a topological analysis tool called “persistence” to determine if it is useful for earthwork detection in rangelands. We found that persistence can be used to detect earthworks with 83% precision and 64% accuracy. Breached berms and berms with significant upslope sedimentation are most likely not to be detected using persistence. These results indicate that persistence can be useful for terrain analysis, and it has the potential to substantially reduce manual effort in feature detection by identifying regions where berms may be found.

 
NSF-PAR ID:
10494530
Author(s) / Creator(s):
 ;  ;  
Publisher / Repository:
DOI PREFIX: 10.1029
Date Published:
Journal Name:
Water Resources Research
Volume:
60
Issue:
2
ISSN:
0043-1397
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Combining information from active and passive sampling of mobile animals is challenging because active‐sampling data are affected by limited detection of rare or sparse taxa, while passive‐sampling data reflect both density and movement. We propose that a model‐based analysis allows information to be combined between these methods to interpret variation in the relationship between active estimates of density and passive measurements of catch per unit effort to yield novel information on activity rates (distance/time). We illustrate where discrepancies arise between active and passive methods and demonstrate the model‐based approach with seasonal surveys of fish assemblages in the Florida Everglades, where data are derived from concurrent sampling with throw traps, an enclosure‐type sampler producing point estimates of density, and drift fences with unbaited minnow traps that measure catch per unit effort (CPUE). We compared incidence patterns generated by active and passive sampling, used hierarchical Bayesian modeling to quantify the detection ability of each method, characterized interspecific and seasonal variation in the relationship between density and passively measured CPUE, and used a predator encounter‐rate model to convert variable CPUE–density relationships into ecological information on activity rates. Activity rate information was used to compare interspecific responses to seasonal hydrology and to quantify spatial variation in non‐native fish activity. Drift fences had higher detection probabilities for rare and sparse species than throw traps, causing discrepancies in the estimated spatial distribution of non‐native species from passively measured CPUE and actively measured density. Detection probability of the passive sampler, but not the active sampler, varied seasonally with changes in water depth. The relationship between CPUE and density was sensitive to fluctuating depth, with most species not having a proportional relationship between CPUE and density until seasonal declines in depth. Activity rate estimates revealed interspecific differences in response to declining depths and identified locations and species with high rates of activity. We propose that variation in catchability from methods that passively measure CPUE can be sources of ecological information on activity. We also suggest that model‐based combining of data types could be a productive approach for analyzing correspondence of incidence and abundance patterns in other applications.

     
  2. Abstract

    The interface between field biology and technology is energizing the collection of vast quantities of environmental data. Passive acoustic monitoring, the use of unattended recording devices to capture environmental sound, is an example where technological advances have facilitated an influx of data that routinely exceeds the capacity for analysis. Computational advances, particularly the integration of machine learning approaches, will support data extraction efforts. However, the analysis and interpretation of these data will require parallel growth in conceptual and technical approaches for data analysis. Here, we use a large hand‐annotated dataset to showcase analysis approaches that will become increasingly useful as datasets grow and data extraction can be partially automated.

    We propose and demonstrate seven technical approaches for analyzing bioacoustic data. These include the following: (1) generating species lists and descriptions of vocal variation, (2) assessing how abiotic factors (e.g., rain and wind) impact vocalization rates, (3) testing for differences in community vocalization activity across sites and habitat types, (4) quantifying the phenology of vocal activity, (5) testing for spatiotemporal correlations in vocalizations within species, (6) among species, and (7) using rarefaction analysis to quantify diversity and optimize bioacoustic sampling.

    To demonstrate these approaches, we sampled in 2016 and 2018 and used hand annotations of 129,866 bird vocalizations from two forests in New Hampshire, USA, including sites in the Hubbard Brook Experimental Forest where bioacoustic data could be integrated with more than 50 years of observer‐based avian studies. Acoustic monitoring revealed differences in community patterns in vocalization activity between forests of different ages, as well as between nearby similar watersheds. Of numerous environmental variables that were evaluated, background noise was most clearly related to vocalization rates. The songbird community included one cluster of species where vocalization rates declined as ambient noise increased and another cluster where vocalization rates declined over the nesting season. In some common species, the number of vocalizations produced per day was correlated at scales of up to 15 km. Rarefaction analyses showed that adding sampling sites increased species detections more than adding sampling days.

    Although our analyses used hand‐annotated data, the methods will extend readily to large‐scale automated detection of vocalization events. Such data are likely to become increasingly available as autonomous recording units become more advanced, affordable, and power efficient. Passive acoustic monitoring with human or automated identification at the species level offers growing potential to complement observer‐based studies of avian ecology.

     
  3. Abstract

    Technological advances have steadily increased the detail of animal tracking datasets, yet fundamental data limitations exist for many species that cause substantial biases in home‐range estimation. Specifically, the effective sample size of a range estimate is proportional to the number of observed range crossings, not the number of sampled locations. Currently, the most accurate home‐range estimators condition on an autocorrelation model, for which the standard estimation frameworks are based on likelihood functions, even though these methods are known to underestimate variance—and therefore ranging area—when effective sample sizes are small.

    Residual maximum likelihood (REML) is a widely used method for reducing bias in maximum‐likelihood (ML) variance estimation at small sample sizes. Unfortunately, we find that REML is too unstable for practical application to continuous‐time movement models. When the effective sample size N is decreased to N ≤ 𝒪(10), which is common in tracking applications, REML undergoes a sudden divergence in variance estimation. To avoid this issue, while retaining REML’s first‐order bias correction, we derive a family of estimators that leverage REML to make a perturbative correction to ML. We also derive AIC values for REML and our estimators, including cases where model structures differ, which is not generally understood to be possible.

    Using both simulated data and GPS data from lowland tapir (Tapirus terrestris), we show how our perturbative estimators are more accurate than traditional ML and REML methods. Specifically, when 𝒪(5) home‐range crossings are observed, REML is unreliable by orders of magnitude, ML home ranges are ~30% underestimated, and our perturbative estimators yield home ranges that are only ~10% underestimated. A parametric bootstrap can then reduce the ML and perturbative home‐range underestimation to ~10% and ~3%, respectively.

    Home‐range estimation is one of the primary reasons for collecting animal tracking data, and small effective sample sizes are a more common problem than is currently realized. The methods introduced here allow for more accurate movement‐model and home‐range estimation at small effective sample sizes, and thus fill an important role for animal movement analysis. Given REML’s widespread use, our methods may also be useful in other contexts where effective sample sizes are small.

     
  4. Introduction Campylobacter spp. infections are responsible for significant diarrheal disease burden across the globe, with prevalence thought to be increasing. Although wild avian species have been studied as reservoirs of Campylobacter spp., our understanding of the role of wild mammalian species in disease transmission and persistence is limited. Host factors influencing infection dynamics in wild mammals have been neglected, particularly life traits, and the role of these factors in zoonotic spillover risk is largely unknown. Methods Here, we conducted a systematic literature review, identifying mammalian species that had been tested for Campylobacter spp. infections (molecular and culture based). We used logistic regression to evaluate the relationship between the detection of Campylobacter spp. in feces and host life traits (urban association, trophic level, and sociality). Results Our analysis suggests that C. jejuni transmission is associated with urban living and trophic level. The probability of carriage was highest in urban-associated species (p = 0.02793) and the most informative model included trophic level. In contrast, C. coli carriage appears to be strongly influenced by sociality (p = 0.0113) with trophic level still being important. Detection of Campylobacter organisms at the genus level, however, was only associated with trophic level (p = 0.0156), highlighting the importance of this trait in exposure dynamics across host and Campylobacter pathogen systems. Discussion While many challenges remain in the detection and characterization of Campylobacter spp., these results suggest that host life traits may have important influence on pathogen exposure and transmission dynamics, providing a useful starting point for more directed surveillance approaches.
  5. Abstract

    Historical museum records provide potentially useful data for identifying drivers of change in species occupancy. However, because museum records are typically obtained via many collection methods, methodological developments are needed to enable robust inferences. Occupancy–detection models, a relatively new and powerful suite of statistical methods, are a potentially promising avenue because they can account for changes in collection effort through space and time.

    We use simulated datasets to identify how and when patterns in data and/or modelling decisions can bias inference. We focus primarily on the consequences of contrasting methodological approaches for dealing with species' ranges and inferring species' non‐detections in both space and time.

    We find that not all datasets are suitable for occupancy–detection analysis but, under the right conditions (namely, datasets that are broken into more time periods for occupancy inference and that contain a high fraction of community‐wide collections, or collection events that focus on communities of organisms), models can accurately estimate trends. Finally, we present a case study on eastern North American odonates where we calculate long‐term trends of occupancy using our most robust workflow.

    These results indicate that occupancy–detection models are a suitable framework for some research cases and expand the suite of available tools for macroecological analysis available to researchers, especially where structured datasets are unavailable.

     