

Title: The importance of making testable predictions: A cautionary tale
We found a startling correlation (Pearson ρ > 0.97) between a single event in daily sea surface temperatures each spring and peak fish egg abundance measurements the following summer, in 7 years of approximately weekly fish egg abundance data collected at Scripps Pier in La Jolla, California. Even more surprising was that this event-based result persisted despite the large and variable number of fish species involved (up to 46), and the large and variable time interval between trigger and response (up to ~3 months). To mitigate potential over-fitting, we made an out-of-sample prediction beyond the publication process for the peak summer egg abundance observed at Scripps Pier in 2020 (available on bioRxiv). During peer review, the prediction failed, and while it would be tempting to explain this away as a result of the record-breaking toxic algal bloom that occurred during the spring (9x higher concentration of dinoflagellates than ever previously recorded), a re-examination of our methodology revealed a potential source of over-fitting that had not been evaluated for robustness. This cautionary tale highlights the importance of testable, true out-of-sample predictions of future values that cannot (even accidentally) be used in model fitting, and that can therefore catch model assumptions that may otherwise escape notice. We believe that this example can benefit the current push towards ecology as a predictive science and support the notion that predictions should live and die in the public domain, along with the models that made them.
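The distinction the abstract draws between an in-sample correlation and a true out-of-sample prediction can be illustrated with a minimal Python sketch. This is not the authors' method: the helper names (`pearson_r`, `fit_line`, `out_of_sample_error`) and all numbers are hypothetical, and a simple least-squares line stands in for whatever model was actually fitted. The held-out value is used only for scoring, never for fitting.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def out_of_sample_error(train_x, train_y, new_x, new_y):
    """Fit on past years only, then score one future observation.

    new_y is never passed to fit_line, so it cannot leak into the fit.
    Returns (predicted value, observed minus predicted).
    """
    slope, intercept = fit_line(train_x, train_y)
    predicted = slope * new_x + intercept
    return predicted, new_y - predicted


if __name__ == "__main__":
    # Hypothetical values for 7 training years (NOT the study's data).
    spring_signal = [1.0, 1.4, 2.1, 2.5, 3.0, 3.6, 4.2]
    peak_eggs = [10.5, 13.8, 21.0, 24.1, 30.9, 35.2, 42.0]
    print("in-sample r:", round(pearson_r(spring_signal, peak_eggs), 3))

    # Hypothetical 8th year: prediction registered first, observation later.
    predicted, error = out_of_sample_error(spring_signal, peak_eggs, 3.3, 12.0)
    print("predicted:", round(predicted, 1), "error:", round(error, 1))
```

A high in-sample correlation says nothing about the size of the error on the held-out year, which is why the prediction has to be fixed before the new observation exists.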
Award ID(s):
1637632 1655203 1660584
NSF-PAR ID:
10233986
Editor(s):
Belgrano, Andrea
Date Published:
Journal Name:
PLOS ONE
Volume:
15
Issue:
12
ISSN:
1932-6203
Page Range / eLocation ID:
e0236541
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Summary

    The Synechococcus cyanobacterial population at the Scripps Institution of Oceanography pier in La Jolla, CA, shows large increases in abundance, typically in the spring and summer, followed by rapid declines within weeks. Here we used amplicon sequencing of the ribosomal RNA internal transcribed spacer region to examine the microdiversity within this cyanobacterial genus during these blooms as well as further offshore in the Southern California coastal ecosystem (CCE). These analyses revealed numerous Synechococcus amplicon sequence variants (ASVs) and that clade and ASV composition can change over the course of blooms. We also found that a large bloom in August 2016 was highly anomalous both in its overall Synechococcus abundance and in terms of the presence of normally oligotrophic Synechococcus clade II. The dominant ASVs at the pier were found further offshore and in the California Current, but we did observe more oligotrophic ASVs and clades along with depth variation in Synechococcus diversity. We also observed that the dominant sequence variant switched during the peak of multiple Synechococcus blooms, with this switch occurring in multiple clades, but we present initial evidence that this apparent ASV switch is a physiological response rather than a change in the dominant population.

     
  2. The phenology of critical biological events in aquatic ecosystems is rapidly shifting due to climate change. Growing variability in phenological cues can increase the likelihood of trophic mismatches, causing recruitment failures in commercially, culturally, and recreationally important fisheries. We tested for changes in spawning phenology of regionally important walleye (Sander vitreus) populations in 194 Midwest US lakes in Minnesota, Michigan, and Wisconsin spanning 1939-2019 to investigate factors influencing walleye phenological responses to climate change and associated climate variability, including ice-off timing, lake physical characteristics, and population stocking history. Data from Wisconsin and Michigan lakes (185 and 5 out of 194 total lakes, respectively) were collected by the Wisconsin Department of Natural Resources (WDNR) and the Great Lakes Indian Fish and Wildlife Commission (GLIFWC) through standardized spring walleye mark-recapture surveys and spring tribal harvest season records. Standardized spring mark-recapture population estimates are performed shortly after ice-off, where following a marking event, a subsequent recapture sampling event is conducted using nighttime electrofishing (typically AC – WDNR, pulsed-DC – GLIFWC) of the entire shoreline including islands for small lakes and index stations for large lakes (Hansen et al. 2015) that is timed to coincide with peak walleye spawning activity (G. Hatzenbeler, WDNR, personal communication; M. Luehring, GLIFWC, personal communication; Beard et al. 1997). Data for four additional Minnesota lakes were collected by the Minnesota Department of Natural Resources (MNDNR) beginning in 1939 during annual collections of walleye eggs and broodstock (Schneider et al. 2010), where date of peak egg take was used to index peak spawning activity. 
For lakes where spawning location did not match the lake for which the ice-off data were collected, the spawning location either flowed into (Pike River) or was within 50 km of a lake where ice-off data were available (Pine River) and these ice-off data were used. Following the affirmation of off-reservation Ojibwe tribal fishing rights in the Ceded Territories of Wisconsin and the Upper Peninsula of Michigan in 1987, tribal spearfishers have targeted walleye during spring spawning (Mrnak et al. 2018). Nightly harvests are recorded as part of a compulsory creel survey (US Department of the Interior 1991). Using these records, we calculated the date of peak spawning activity in a given lake-year as the day of maximum tribal harvest. Although we were unable to account for varying effort in these data, a preliminary analysis comparing spawning dates estimated using tribal harvest to those determined from standardized agency surveys in the same lake and year showed that they were highly correlated (Pearson’s correlation: r = 0.91, P < 0.001). For lakes that had walleye spawning data from both agency surveys and tribal harvest, we used the data source with the greatest number of observation years. Ice-off phenology data were collected from two sources: either observed, from the Global Lake and River Ice Phenology database (Benson et al. 2000), or modeled, from a USGS region-wide machine-learning model which used North American Land Data Assimilation System (NLDAS) meteorological inputs combined with lake characteristics (lake position, clarity, size, depth, hypsography, etc.) to predict daily water column temperatures from 1979 - 2022, from which ice-off dates could be derived (https://www.sciencebase.gov/catalog/item/6206d3c2d34ec05caca53071; see Corson-Dosch et al. 2023 for details). Modeled data for our study lakes (see Read et al. 2021 for modeling details) performed well in reflecting ice phenology when compared to observed data (i.e., highly significant correlation between observed and modeled ice-off dates when both were available; r = 0.71, p < 0.001).
Lake surface area (ha), latitude, and maximum depth (m) were acquired from agency databases and lake reports. Lake class was based on a WDNR lakes classification system (Rypel et al. 2019) that categorized lakes based on temperature, water clarity, depth, and fish community. Walleye stocking history was defined using the walleye stocking classification system developed by the Wisconsin Technical Working Group (see also Sass et al. 2021), which categorized lakes based on relative contributions of naturally-produced and stocked fish to adult recruitment by relying heavily on historic records of age-0 and age-1 catch rates and stocking histories. Wisconsin lakes were divided into three groups: natural recruitment (NR), a combination of stocking and natural recruitment (C-ST), and stocked only (ST). Walleye natural recruitment was indexed as age-0 walleye CPE (number of age-0 walleye captured per km of shoreline electrofished) from WDNR and GLIFWC fall electrofishing surveys (see Hansen et al. 2015 for details). We excluded lake-years where stocking of age-0 fish occurred before age-0 surveys to only include measurements of naturally-reproduced fish. 
  3. Abstract

    Background: Anadromous rainbow smelt (Osmerus mordax) have experienced a large range reduction in recent decades and the status of remnant spawning populations is poorly known in Maine, where these fish have significant ecological, cultural, and commercial relevance. Defining the remnant range of anadromous smelt is more difficult than for many declining fish species because adults are only ephemerally present while spawning in small coastal streams at night during spring runoff periods when traditional assessments can be unreliable or even hazardous. We hypothesized that eDNA might facilitate improved survey efforts to define smelt spawning habitat, but that detection could also face challenges from adult eDNA quickly flushing out of these small stream systems. We combined daytime eDNA sampling with nighttime fyke netting to ascertain a potential window of eDNA detection before conducting eDNA surveys in four streams of varying abundance. Hierarchical occupancy modeling was in turn employed to estimate eDNA encounter probabilities relative to numbers of sampling events (date), samples within events, and qPCR replicates within samples.
    Results: The combined eDNA and fyke net study indicated eDNA was detectable over an extended period, culminating approximately 8–13 days following peak spawning, suggesting developing smelt larvae might be the primary source of eDNA. Subsequently, smelt eDNA was readily detected in eDNA surveys of four streams, particularly following remediation of PCR inhibitors. Hierarchical occupancy modeling confirmed our surveys had high empirical detection for most sites, and that future surveys employing at least three sampling events, three samples per event, and six qPCR replicates can afford greater than 90% combined detection capability in low abundance systems.
    Conclusions: These results demonstrate that relatively modest eDNA sampling effort has high capacity to detect this ephemerally present species of concern at low to moderate abundances. As such, smelt eDNA detection could improve range mapping by providing longer survey windows, safer sampling conditions, and lower field effort in low density systems than afforded by existing visual and netting approaches.
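How detection probability accumulates over nested sampling levels (events, samples within events, qPCR replicates within samples) can be sketched in a few lines of Python. This is an illustration under a simple independence assumption, not the hierarchical occupancy model fitted in the study; the function names and every probability below are hypothetical.

```python
def cumulative(p: float, n: int) -> float:
    """Probability of at least one success in n independent trials."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def combined_detection(p_event: float, p_sample: float, p_replicate: float,
                       n_events: int, n_samples: int, n_replicates: int) -> float:
    """Nested cumulative detection at an occupied site.

    p_event:     chance eDNA is available during one sampling event
    p_sample:    chance one sample captures eDNA, given it is available
    p_replicate: chance one qPCR replicate amplifies, given eDNA in the sample
    """
    # A sample yields a detection if it holds eDNA and any replicate amplifies.
    per_sample = p_sample * cumulative(p_replicate, n_replicates)
    # An event yields a detection if eDNA is available and any sample detects.
    per_event = p_event * cumulative(per_sample, n_samples)
    # The survey detects the species if any event does.
    return cumulative(per_event, n_events)


if __name__ == "__main__":
    # Hypothetical per-level probabilities, chosen only for illustration.
    design = combined_detection(0.6, 0.5, 0.4,
                                n_events=3, n_samples=3, n_replicates=6)
    print(f"combined detection: {design:.3f}")
```

Varying `n_events`, `n_samples`, and `n_replicates` in such a function shows which level of replication buys the most detection for a given set of per-level probabilities.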
  4. Abstract

    Between October 2018 and May 2019, sea surface temperature conditions in the central‐eastern tropical Pacific indicated a mild El Niño event. In May 2019, the global El Niño Southern Oscillation (ENSO) forecast consensus was that these generally weak warm patterns would persist at least until the end of the northern hemisphere summer. El Niño and its impact on local climatic conditions in southern coastal Ecuador influence the inter‐annual transmission of dengue fever in the region. In this study, we use an ENSO model to issue forecasts of El Niño for the year 2019, which are then used to predict local climate variables, precipitation and minimum temperature, in the city of Machala, Ecuador. All these forecasts are incorporated in a dengue transmission model, specifically developed and tested for this area, to produce out‐of‐sample predictions of dengue risk. Predictions are issued at the beginning of January 2019 for the whole year, thus providing the longest forecast lead time of 12 months. Preliminary results indicate that the mild and ongoing El Niño event did not provide the optimum climate conditions for dengue transmission, with the model predicting a very low probability of a dengue outbreak during the typical peak season in Machala in 2019. This is contrary to 2016, when a large El Niño event resulted in excess rainfall and warmer temperatures in the region, and a dengue outbreak occurred 3 months earlier than expected. This event was successfully predicted using a similar prediction framework to the one applied here. With the present study, we continue our efforts to build and test a climate service tool to issue early warnings of dengue outbreaks in the region.

     
  5. Abstract

    Remote sensing imagery can provide critical information on the magnitude and extent of damage caused by forest pests and pathogens. However, monitoring short‐term changes in deciduous forest condition caused by defoliating insects is challenging and requires approaches that directly account for seasonal vegetation dynamics. We implemented a previously published harmonic modeling approach for forest condition monitoring in Google Earth Engine and systematically assessed the relative ability of condition change products generated using various model parameterizations for predicting pest abundances and defoliation during the 2016–2018 gypsy moth (Lymantria dispar) outbreak in southern New England. Our comparisons revealed that most models made reasonable predictions of changes in canopy condition and egg and larval abundances of L. dispar, indicating a strong correlation between our harmonic‐based estimates of condition change and defoliator activity. The greatest differences in predictive ability were in the spectral domain, with assessments based on Tasseled Cap Greenness, Simple Ratio, and the Enhanced Vegetation Index ranking among the top models, and the commonly used Normalized Difference Vegetation Index consistently exhibiting poorer performance. We also observed notable differences in the magnitude of scores for different baseline periods. Additionally, we found that Landsat‐based condition scores better explained larval abundance than egg mass counts, which have historically been used as a proxy for later‐season larval abundance, indicating that our remote sensing approach may be more accurate and cost‐effective for generating consistent retrospective assessments of L. dispar population abundance in addition to estimates of canopy damage. 
These findings provide important linkages between spectral changes detected using a harmonic modeling approach and biophysical aspects of defoliator activity, with potential to extend monitoring and prediction to regional or even continental scales.

     