Title: Estimating Lake Water Volume With Regression and Machine Learning Methods
The volume of a lake is a crucial component in understanding environmental and hydrologic processes. The State of Minnesota (USA) has tens of thousands of lakes, but only a small fraction have readily available bathymetric information. In this paper we develop and test methods for predicting water volume in the lake-rich region of Central Minnesota. We used three different published regression models for predicting lake volume from available data. The first model used lake surface area as the sole independent variable. The second model used lake surface area plus an additional independent variable, the average change in land surface area in a designated buffer area surrounding a lake. The third model also used lake surface area but assumed the land surface to be a self-affine surface, so that the surface area-lake volume relationship is governed by a scale defined by the Hurst coefficient. These models all used bathymetric data available for 816 lakes across the study region. The models explained over 80% of the variation in lake volumes. The difference between the total predicted lake volume and the total known volume was <2%. We applied these models to predict lake volumes using available independent variables for over 40,000 lakes within the study region. The total lake volumes for the methods ranged from 1,180,000 to 1,200,000 hectare-meters. We also investigated machine learning models for estimating individual lake volumes and found that they achieved comparable or slightly better predictive performance than the three regression methods. A 15-year time series of satellite data for the study region was used to develop a time series of lake surface areas, which were used with the first regression model to calculate individual lake volumes and the temporal variation in the total lake volume of the study region. The time series of lake volumes quantified the effect on water volume of a dry period that occurred from 2011 to 2012. These models are important not only for estimating lake volume but also for providing critical information for scaling up different ecosystem processes that are sensitive to lake bathymetry.
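As a rough illustration of the first model (lake surface area as the sole predictor), the sketch below fits a power law in log-log space. The abstract does not state the functional form, so the power-law form, the function names, and the synthetic data are all assumptions made here for illustration; none of the numbers come from the 816-lake data set. The only fixed quantity is the unit conversion: one hectare-meter is 10,000 m³, or 1e-5 km³.

```python
import numpy as np

# Unit conversion: 1 hectare-meter = 10,000 m^2 * 1 m = 1e4 m^3 = 1e-5 km^3.
HAM_TO_KM3 = 1e-5


def fit_power_law(area_ha, volume_ham):
    """Least-squares fit of log10(V) = a + b * log10(A).

    Assumed model form (not taken from the paper). Returns (a, b, r2),
    where r2 is the coefficient of determination in log space.
    """
    x = np.log10(np.asarray(area_ha, dtype=float))
    y = np.log10(np.asarray(volume_ham, dtype=float))
    b, a = np.polyfit(x, y, 1)  # polyfit returns the highest-degree term first
    resid = y - (a + b * x)
    r2 = 1.0 - resid.var() / y.var()
    return a, b, r2


def predict_volume(area_ha, a, b):
    """Back-transform the log-log fit to a volume in hectare-meters."""
    return 10.0 ** (a + b * np.log10(np.asarray(area_ha, dtype=float)))


# Synthetic lakes (hypothetical): areas from 1 to 1,000 ha, volumes scattered
# around an arbitrary power law with log-normal noise.
rng = np.random.default_rng(0)
area = 10.0 ** rng.uniform(0.0, 3.0, size=200)
volume = 10.0 ** (0.3 + 1.1 * np.log10(area) + rng.normal(0.0, 0.1, size=200))

a, b, r2 = fit_power_law(area, volume)
total_predicted = predict_volume(area, a, b).sum()
total_known = volume.sum()
relative_difference = abs(total_predicted - total_known) / total_known

# The regional totals reported in the abstract, converted to cubic kilometers.
low_km3 = 1_180_000 * HAM_TO_KM3
high_km3 = 1_200_000 * HAM_TO_KM3
```

Comparing `total_predicted` with `total_known` mirrors the summed-volume check described in the abstract. Note that back-transforming a log-space fit tends to underestimate totals slightly unless a bias correction is applied; whether the published models apply one is not stated here.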
Award ID(s):
1934721
PAR ID:
10346169
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
Frontiers in Water
Volume:
4
ISSN:
2624-9375
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract. The Pleistocene sand sea on the Arctic Coastal Plain (ACP) of northern Alaska is underlain by an ancient sand dune field, a geological feature that affects regional lake characteristics. Many of these lakes, which cover approximately 20 % of the Pleistocene sand sea, are relatively deep (up to 25 m). In addition to the natural importance of ACP sand sea lakes for water storage, energy balance, and ecological habitat, the need for winter water for industrial development and exploration activities makes lakes in this region a valuable resource. However, ACP sand sea lakes have received little prior study. Here, we collect in situ bathymetric data to test 12 model variants for predicting sand sea lake depth based on analysis of Landsat-8 Operational Land Imager (OLI) images. Lake depth gradients were measured at 17 lakes in midsummer 2017 using a Humminbird 798ci HD SI Combo automatic sonar system. The field-measured data points were compared to red–green–blue (RGB) bands of a Landsat-8 OLI image acquired on 8 August 2016 to select and calibrate the most accurate spectral-depth model for each study lake and map bathymetry. Exponential functions using a simple band ratio (with bands selected based on lake turbidity and bed substrate) yielded the most successful model variants. For each lake, the most accurate model explained 81.8 % of the variation in depth, on average. Modeled lake bathymetries were integrated with remotely sensed lake surface area to quantify lake water storage volumes, which ranged from 1.056 × 10⁻³ to 57.416 × 10⁻³ km³. Due to variations in depth maxima, substrate, and turbidity between lakes, a regional model is currently infeasible, rendering necessary the acquisition of additional in situ data with which to develop a regional model solution. Estimating lake water volumes using remote sensing will facilitate better management of expanding development activities and serve as a baseline by which to evaluate future responses to ongoing and rapid climate change in the Arctic. All sonar depth data and modeled lake bathymetry rasters can be freely accessed at https://doi.org/10.18739/A2SN01440 (Simpson and Arp, 2018) and https://doi.org/10.18739/A2HT2GC6G (Simpson, 2019), respectively.
  2. Abstract Depth regulates many attributes of aquatic ecosystems, but relatively few lakes are measured, and existing datasets are biased toward large lakes. To address this, we used a large dataset of maximum (Zmax; n = 16,831) and mean (Zmean; n = 5,881) depth observations to create new depth models, focusing on lakes < 1,000 ha. We then used the models to characterize patterns in lake basin shape and volume. We included terrain metrics, water temperature and reflectance, polygon attributes, and other predictors in a random forest model. Our final models generally outperformed existing models (Zmax: root mean square error [RMSE] = 8.0 m; Zmean: RMSE = 3.0 m). Our models show that lake depth followed a Pareto distribution, with 2.8 orders of magnitude fewer lakes for an order of magnitude increase in depth. In addition, despite orders of magnitude variation in surface area, most size classes had a modal maximum depth of ~5 m. Concave (bowl‐shaped) lake basins represented 79% of all lakes, but lakes were more convex (funnel‐shaped) as surface area increased. Across the conterminous United States, 9.8% of all lake water was within the top meter of the water column, and 48% in the top 10 m. Excluding the Laurentian Great Lakes, we estimate the total volume in the conterminous United States is 1,057–1,294 km³, depending on whether Zmax or Zmean was modeled. Lake volume also exhibited substantial geographic variation, with high volumes in the upper Midwest, Northeast, and Florida and low volumes in the southwestern United States.
  3. Mendoza-Lera, Clara (Ed.)
    The microbial communities of lake sediments have the potential to serve as valuable bioindicators and integrators of watershed land-use and water quality; however, the relative sensitivity of these communities to physio-chemical and geographical parameters must be demonstrated at taxonomic resolutions that are feasible with current sequencing and bioinformatic approaches. The geologically diverse and lake-rich state of Minnesota (USA) is uniquely situated to address this potential because of its variability in ecological region, lake type, and watershed land-use. In this study, we selected twenty lakes with varying physio-chemical properties across four ecological regions of Minnesota. Our objectives were to (i) evaluate the diversity and composition of the bacterial community at the sediment-water interface and (ii) determine how lake location and watershed land-use impact aqueous chemistry and influence bacterial community structure. Our 16S rRNA amplicon data from lake sediment cores, at two depth intervals, indicate that sediment communities are more likely to cluster by ecological region than by any individual lake property (e.g., trophic status, total phosphorus concentration, lake depth). However, composition is tied to a given lake, wherein samples from the same core were more alike than samples collected at similar depths across lakes. Our results illustrate the diversity within lake sediment microbial communities and provide insight into relationships between taxonomy, physicochemical, and geographic properties of north temperate lakes.
  4. Antarctic subglacial lakes can play an important role in ice sheet dynamics, biology, geology, and oceanography, but it is difficult to definitively constrain their character and locations. Subglacial lake locations are related to factors including heat flux, ice surface slope, ice thickness, and bed topography, though these relationships are not fully quantified. Bed topography is particularly important for determining where water flows and accumulates, but digital elevation models of the ice sheet bed rely on interpolation and are unrealistically smooth, biasing estimates of subglacial lake location and surface area. To address this issue, we use geostatistical methods to simulate realistically rough bed topography. We use our simulated topography to predict subglacial lake distribution across the continent using a binomial logistic regression, which uses physical parameters and known lake locations to calculate the probabilities of lake occurrences. Our results suggest that topography models interpolated without appropriate geostatistics overestimate subglacial lake surface area and that total lake surface area is lower than previously predicted. We find that radar‐detected lakes are more likely to occur in the interior of East Antarctica, while altimetry‐detected (active) lakes are expected to be found in West Antarctica and near the grounding line. We observe that radar‐detected lakes have a high correlation with heat flux and ice thickness, while active lakes are associated with higher ice velocity. 
  5. Abstract Aquatic scientists require robust, accurate information about nutrient concentrations and indicators of algal biomass in unsampled lakes in order to understand and predict the effects of global climate and land‐use change. Historically, lake and landscape characteristics have been used as predictor variables in regression models to generate nutrient predictions, but often with significant uncertainty. An alternative approach to improve predictions is to leverage the observed relationship between water clarity and nutrients, which is possible because water clarity is more commonly measured than lake nutrients. We used a joint‐nutrient model that conditioned predictions of total phosphorus, nitrogen, and chlorophyll a on observed water clarity. Our results demonstrated substantial reductions (8–27%; median = 23%) in prediction error when conditioning on water clarity. These models will provide new opportunities for predicting nutrient concentrations of unsampled lakes across broad spatial scales with reduced uncertainty.