Title: Biases in lake water quality sampling and implications for macroscale research
Abstract Growth of macroscale limnological research has been accompanied by an increase in secondary datasets compiled from multiple sources. We examined patterns of data availability in LAGOS‐NE, a dataset derived from 87 sources, to identify biases in availability of lake water quality data and to consider how such biases might affect perceived patterns at a subcontinental scale. Of eight common water quality parameters, variables indicative of trophic state (Secchi, chlorophyll, and total P) were most abundant in terms of total observations, lakes sampled, and long‐term records, whereas carbon variables (true color and dissolved organic carbon) were scarcest. Most data were collected during summer from larger (≥ 20 ha) lakes over 1–3 yr. Approximately 80% of data for each variable is derived from ~ 20% of sampled lakes. Long‐term (≥ 20 yr) records were rare and spatially clustered. Data availability is linked to major management challenges (eutrophication and acid rain), citizen science, and a few programs that quantify C and N variables. Resampling exercises suggested that correcting for the surface area sampling bias did not substantially change statistical distributions of the eight variables. Further, estimating a lake's long‐term median Secchi, chlorophyll, and total P using average record lengths had high uncertainty, but modest increases in sample size to > 5 yr yielded estimates with manageable error. Although the specific nature of sampling biases may vary among regions, we expect that they are widespread. Thus, large integrated datasets can and should be used to identify tendencies in how lakes are studied and to address these biases as part of broad‐scale limnological investigations.
Award ID(s):
1638679 1638554 1638539
PAR ID:
10371533
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Limnology and Oceanography
Volume:
64
Issue:
4
ISSN:
0024-3590
Format(s):
Medium: X
Size(s):
p. 1572-1585
Sponsoring Org:
National Science Foundation
More Like this
  1. This dataset contains water quality measurements made on Green Lakes 1, 2, 3, 4, 5, and Lake Albion. Sampling of Green Lake 4 began in 2000 and is ongoing. Ongoing sampling of Green Lake 1 started in 2014, and ongoing sampling of Lake Albion started in 2016. Water samples were collected for analysis of chlorophyll a and nutrient analysis (which is available in glvwatsolu.dm.data), along with field measurements of pH, temperature, specific conductivity, dissolved oxygen (DO), % saturation, Secchi depth, and PAR. Secchi depth is recorded in the 0 m row; however, it is a measurement of depth, so its units are meters. Most samples were collected between 0800 and 1200 MST. The first sampling date each summer occurs shortly after the ice has melted. Data are collected from an inflatable raft at the deepest point of the lake, or from the lake inlet and outlet when surface flow is present. The majority of the chlorophyll-a measurements were taken at the surface (0 m), the metalimnion (3 m), and the hypolimnion (usually 8-11 m). However, additional measurements were taken during several years for side projects of the long-term dataset and are included in this dataset. Water samples from the metalimnion or hypolimnion were collected using a Van Dorn sampler, and surface samples were collected as grab samples from the water column surface, the inlet, and the outlet. Field measurements were made using either a YSI DO meter or a YSI multi-probe meter (2014-2017, YSI MPS 556; 2018-ongoing, YSI ProPlus) and a Li-Cor meter with a quantum sensor. Chlorophyll-a was extracted from filtered samples, and absorbance was measured before and after acidification to quantify chlorophyll a concentration.
  2. LakeBeD-US: Ecology Edition is a harmonized lake water quality dataset containing time series and vertical profiles of 21 lakes in the United States monitored by long-term monitoring institutions. These institutions include the North Temperate Lakes Long-Term Ecological Research program (NTL-LTER), Niwot Ridge Long-Term Ecological Research program (NWT-LTER), National Ecological Observatory Network (NEON), and the Carey Lab at Virginia Tech as part of the Virginia Reservoirs Long-Term Research in Environmental Biology (LTREB) site in collaboration with the Western Virginia Water Authority. The data include depth-discrete observations of 17 water quality variables including temperature, dissolved oxygen, chemical properties, Secchi depth, and more. Observations are divided into data collected by automated sensors at a relatively high temporal frequency and manually sampled data at a relatively low temporal frequency. All data were collected in situ. The data are available as Apache Parquet files, and the included R scripts give guidance on how to utilize and query the dataset in R. LakeBeD-US: Ecology Edition is an ecological science-oriented companion to LakeBeD-US: Computer Science Edition. The Computer Science Edition is available on the Hugging Face Hub. 
  3. Abstract: Aim In this study, we present the results of a project which used Landsat Collection 2 Surface Reflectance data and European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5) data to develop a machine learning model to estimate Secchi depth in Lake Yojoa, Honduras. Methods Satellite remote sensing data obtained within a 7-day window of an in situ measurement were matched with in situ Secchi depth measurements and were partitioned into train-test-validate data sets for model development. Results The machine learning model had good (R² = 0.57) agreement and reasonable uncertainty (MAE = 0.58 m) between remotely estimated and in situ observed Secchi depth. Application of the machine learning model increased the monitoring record of Lake Yojoa from 6 years of measured data to a 23-year record. Conclusions This model demonstrates the utility of coordinating in situ sampling schedules of short-term research projects with satellite imagery acquisition schedules in order to increase the temporal coverage of remote sensing derived estimates of water quality in understudied lakes.
  4. Abstract Southern Ocean (SO) phytoplankton chlorophyll is highly variable on sub‐seasonal time scales. Although the SO is the windiest ocean basin globally, it is not conclusively understood how storms impact SO phytoplankton dynamics. Much of our existing knowledge stems from satellites, but biases due to data gaps from cloud cover and low solar angles remain unquantified. Here, we use ocean–sea‐ice simulations with the Community Earth System Model to quantify the climatological 1997–2018 imprint of storms on chlorophyll and phytoplankton dynamics in the ice‐free SO. Additionally, by comparing the full‐field model output to synthetic satellite observations, we quantify sampling biases in satellite‐derived estimates. We find that both the sign and the magnitude of the average surface chlorophyll imprint vary substantially across storms but last for at least 4 days after the storm passing. Based on our analysis, more than one third of the storms explain the majority of local non‐seasonal chlorophyll variability, but satellite‐derived storm imprints are often too large in magnitude. On the day of the storm passing, changes in vertical mixing predominantly cause surface chlorophyll anomalies, and reduced light availability due to enhanced cloud cover outweighs the enhanced nutrient availability due to entrainment. Interestingly, storms imprint differently on total net primary production than on surface chlorophyll, demonstrating the difficulty to derive carbon‐cycle impacts from a surface‐chlorophyll assessment. With SO future storm activity projected to increase, complementing satellite observations with other observing technologies, for example, profiling floats, is necessary to better constrain how storms impact biological carbon cycling in the SO. 
  5. Rapid changes in climate and land use are having substantial and interacting impacts on lake water quality around the world. Here, we synthesized time-series data for dissolved oxygen, temperature, chlorophyll-a, total phosphorus, total nitrogen, and dissolved organic carbon at multiple depths in 822 lakes to facilitate analyses of these changes. The dataset spans 1921–2022, with a median data duration of 29 years (range: 5–102) and a median of 5 unique sampling dates per year at each lake. Lakes in the dataset have a median depth of 12.5 m (range: 1.5–480 m), a median surface area of 85.4 ha (range: 0.5–237000 ha), and a median elevation of 264 m (range: -215–2804 m). The lakes are located in 18 countries across 5 continents, with latitudes ranging from -42.6 to 68.3. To facilitate interoperability with other large-scale datasets, each lake is linked to a unique hydroLAKES lake ID when possible (n = 683).