Title: Experimenting With the Past to Improve Environmental Monitoring
Long-term monitoring programs are a fundamental part of both understanding ecological systems and informing management decisions. However, many constraints can prevent monitoring programs from being designed to consider statistical power, site selection, or the full costs and benefits of monitoring. Key considerations can be incorporated into the optimal design of a management program with simulations and experiments. Here, we advocate for the expanded use of a third approach: non-random resampling of previously-collected data. This approach conducts experiments with available data to understand the consequences of different monitoring approaches. We first illustrate non-random resampling in determining the optimal length and frequency of monitoring programs to assess species trends. We then apply the approach to a pair of additional case studies, from fisheries and agriculture. Non-random resampling of previously-collected data is underutilized, but has the potential to improve monitoring programs.
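As a rough illustration of what non-random resampling can look like for the monitoring-length question, the sketch below fits a linear trend to every contiguous window of an existing time series and asks how long a window must be before its trend reliably matches the trend of the full record. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the relative `tolerance` of 0.5, and the 0.95 agreement `threshold` are all hypothetical choices made for the example.

```python
import numpy as np


def window_slopes(years, values, window):
    """Least-squares slope for every contiguous window of `window` observations."""
    slopes = []
    for start in range(len(years) - window + 1):
        y = years[start:start + window]
        v = values[start:start + window]
        slopes.append(np.polyfit(y, v, 1)[0])  # index 0 is the slope term
    return np.array(slopes)


def agreement_by_length(years, values, tolerance=0.5, min_window=3):
    """Fraction of windows, per window length, whose slope is close to the full-series slope.

    `tolerance` is a hypothetical relative tolerance: a window "agrees" when its
    slope differs from the full-series slope by at most tolerance * |full slope|.
    """
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    years = years - years[0]  # shift to start at zero for better numerical conditioning
    full_slope = np.polyfit(years, values, 1)[0]
    agreement = {}
    for window in range(min_window, len(years) + 1):
        slopes = window_slopes(years, values, window)
        close = np.abs(slopes - full_slope) <= tolerance * abs(full_slope)
        agreement[window] = float(np.mean(close))
    return agreement


def years_to_stability(agreement, threshold=0.95):
    """Smallest window length at which it, and every longer length, meets `threshold`."""
    stable = None
    for window in sorted(agreement, reverse=True):
        if agreement[window] >= threshold:
            stable = window
        else:
            break
    return stable
```

Calling `years_to_stability(agreement_by_length(years, counts))` on an existing record gives one possible estimate of the minimum monitoring length; because the windows are contiguous and exhaustive rather than randomly drawn, the resampling is non-random. The same loop could be adapted to study sampling frequency by subsampling every second or third year within each window.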
Award ID(s):
1027253 1637653 1832042 1838807
Author(s) / Creator(s):
Date Published:
Journal Name:
Frontiers in Ecology and Evolution
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Background: Understanding how study design and monitoring strategies shape inference within, and synthesis across, studies is critical across biological disciplines. Many biological and field studies are short term and limited in scope. Monitoring studies are critical for informing public health about potential vectors of concern, such as Ixodes scapularis (black-legged ticks). Black-legged ticks are a taxon of ecological and human health concern due to their status as primary vectors of Borrelia burgdorferi, the bacterium that causes Lyme disease. However, variation in black-legged tick monitoring, and gaps in data, are currently considered major barriers to understanding population trends and, in turn, predicting Lyme disease risk. To understand how variable methodology in black-legged tick studies may influence which population patterns researchers find, we conducted a data synthesis experiment.

     Materials and Methods: We searched for publicly available black-legged tick abundance datasets that had at least 9 years of data, using keywords about ticks in internet search engines, literature databases, data repositories and public health websites. Our analysis included 289 datasets from seven surveys from locations in the US, ranging in length from 9 to 24 years. We used a moving window analysis, a non-random resampling approach, to investigate the temporal stability of black-legged tick population trajectories across the US. We then used t-tests to assess differences in stability time across different study parameters.

     Results: All of our sampled datasets required 4 or more years to reach stability. We also found that several study factors can affect the likelihood of a study reaching stability, and of data leading to misleading results if the study does not reach stability. Specifically, datasets collected via dragging reached stability significantly faster than data collected via opportunistic sampling. Datasets that sampled larvae reached stability significantly later than those that sampled adults or nymphs. Additionally, datasets collected at the broadest spatial scale (county) reached stability fastest.

     Conclusion: We used 289 datasets from seven long-term black-legged tick studies to conduct a non-random data resampling experiment, revealing that sampling design does shape inferences in black-legged tick population trajectories and how many years it takes to find stable patterns. Specifically, our results show the importance of study length, sampling technique, life stage, and geographic scope in understanding black-legged tick populations, in the absence of standardized surveillance methods. Current public health efforts based on existing black-legged tick datasets must take monitoring study parameters into account, to better understand if and how to use monitoring data to inform decision-making. We also advocate that potential future forecasting initiatives consider these parameters when projecting future black-legged tick population trends.
  2. Abstract

    Growth of macroscale limnological research has been accompanied by an increase in secondary datasets compiled from multiple sources. We examined patterns of data availability in LAGOS‐NE, a dataset derived from 87 sources, to identify biases in availability of lake water quality data and to consider how such biases might affect perceived patterns at a subcontinental scale. Of eight common water quality parameters, variables indicative of trophic state (Secchi, chlorophyll, and total P) were most abundant in terms of total observations, lakes sampled, and long‐term records, whereas carbon variables (true color and dissolved organic carbon) were scarcest. Most data were collected during summer from larger (≥ 20 ha) lakes over 1–3 yr. Approximately 80% of data for each variable is derived from ~ 20% of sampled lakes. Long‐term (≥ 20 yr) records were rare and spatially clustered. Data availability is linked to major management challenges (eutrophication and acid rain), citizen science, and a few programs that quantify C and N variables. Resampling exercises suggested that correcting for the surface area sampling bias did not substantially change statistical distributions of the eight variables. Further, estimating a lake's long‐term median Secchi, chlorophyll, and total P using average record lengths had high uncertainty, but modest increases in sample size to > 5 yr yielded estimates with manageable error. Although the specific nature of sampling biases may vary among regions, we expect that they are widespread. Thus, large integrated datasets can and should be used to identify tendencies in how lakes are studied and to address these biases as part of broad‐scale limnological investigations.

  3. Abstract

    Land degradation is a leading cause of biodiversity loss, and understanding its consequences on freshwater ecosystems remains a priority for improving the effectiveness of restoration practices and ecosystem assessments. Freshwater monitoring programs use macroinvertebrates to assess the biotic effects of degradation and management actions, often using the ratio of observed to expected taxa at a site—O/E—for this purpose. Despite the power of the O/E approach, large amounts of data are required to generate an expectation and it can be difficult to define a threshold value for degraded sites. An alternative assessment tool is phylogenetic diversity, which is widely used in academic biology but rarely applied in management despite empirical correlations between phylogenetic diversity and management targets such as ecosystem structure and function. Here, we use macroinvertebrate data from 1400 watersheds, collected since 1998, to evaluate the potential for phylogenetic metrics to inform evaluations of management practices. These watersheds were chosen because their low disturbance levels and high habitat heterogeneity have made them problematic to assess with O/E. Phylogenetic diversity detected degradation of assemblages and was sensitive enough to parse impacts to inform management actions. This is particularly notable given the phylogenetic metrics, unlike O/E, did not require additional “baseline” data. Site disturbance and broader environmental drivers strongly predicted site phylogenetic structure, providing management objectives to increase site quality. We call on others to consider using phylogenetic diversity to complement existing O/E schemes, particularly in systems where O/E is insufficient to prioritize management objectives.

  4. Bipolar Disorder, a mood disorder with recurrent mania and depression, requires ongoing monitoring and specialty management. Current monitoring strategies are clinically-based, engaging highly specialized medical professionals who are becoming increasingly scarce. Automatic speech-based monitoring via smartphones has the potential to augment clinical monitoring by providing inexpensive and unobtrusive measurements of a patient’s daily life. The success of such an approach is contingent on the ability to successfully utilize “in-the-wild” data. However, most existing work on automatic mood detection uses datasets collected in clinical or laboratory settings. This study presents experiments in automatically detecting depression severity in individuals with Bipolar Disorder using data derived from clinical interviews and from personal conversations. We find that mood assessment is more accurate using data collected from clinical interactions, in part because of their highly structured nature. We demonstrate that although the features that are most effective in clinical interactions do not extend well to personal conversational data, we can identify alternative features relevant in personal conversational speech to detect mood symptom severity. Our results highlight the challenges unique to working with “in-the-wild” data, providing insight into the degree to which the predictive ability of speech features is preserved outside of a clinical interview. 
  5. Summary Highlights

    Explored new calibration subsetting methods and chemometric models in soil spectral modelling.

    Compared the methods and models for 17 soil properties in an understudied area of India.

    Random subsetting was not always optimal; subsetting matters and depends on data characteristics.

    Sparse models from genomics performed better in 75% of cases than a standard method.
