

Title: LAGOS‐US RESERVOIR: A database classifying conterminous U.S. lakes 4 ha and larger as natural lakes or reservoir lakes
Abstract

The LAGOS‐US RESERVOIR data module classifies all 137,465 lakes ≥ 4 ha in the conterminous U.S. into three categories using a machine learning predictive model based on visual interpretation of lake outlines and a lake shape classification rule. Natural Lakes (NLs) are defined as naturally formed, lacking large, flow‐altering structures; Reservoir Class A's (RSVR_A) are defined as lakes likely human‐made or human‐altered by a large water control structure; and Reservoir Class B's (RSVR_Bs) are lakes likely human‐made but are not connected to streams and have a shape rare in NLs. We trained machine learning models on 12,162 manually classified lakes to predict assignment as an NL or RSVR, then further classified RSVRs based on NHD Fcodes, isolation, and angularity. Our classification indicates that > 46% of lakes ≥ 4 ha in the conterminous U.S. are reservoir lakes. These data can be easily combined with other LAGOS‐US modules and U.S. national databases for the broad‐scale study of reservoir lakes and NLs.
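The two-stage logic described in the abstract (a model-predicted NL/RSVR assignment, followed by a rule that splits RSVRs into classes A and B) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Lake` fields, the `angularity` score, and both cutoff values are hypothetical, and the role of NHD Fcodes is left out because the abstract does not say how they enter the rule.

```python
from dataclasses import dataclass

# Hypothetical cutoffs for illustration only; the abstract gives no values.
RSVR_PROBABILITY_CUTOFF = 0.5
ANGULARITY_CUTOFF = 0.8


@dataclass
class Lake:
    lake_id: int
    p_reservoir: float  # stage 1: model-predicted probability of RSVR
    is_isolated: bool   # True if the lake is not connected to streams
    angularity: float   # hypothetical 0-1 score; higher = more angular outline


def classify(lake: Lake) -> str:
    """Return 'NL', 'RSVR_A', or 'RSVR_B' under the assumed rules."""
    # Stage 1: threshold the model probability into NL vs. RSVR.
    if lake.p_reservoir < RSVR_PROBABILITY_CUTOFF:
        return "NL"
    # Stage 2: isolated, highly angular reservoirs become class B.
    if lake.is_isolated and lake.angularity >= ANGULARITY_CUTOFF:
        return "RSVR_B"
    return "RSVR_A"


def reservoir_share(lakes) -> float:
    """Fraction of lakes assigned to either reservoir class."""
    labels = [classify(lake) for lake in lakes]
    return sum(label != "NL" for label in labels) / len(labels)
```

Applied to the full set of classified lakes, a share computed this way would correspond to the "> 46%" reservoir figure reported in the abstract.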

 
Award ID(s):
1638679
NSF-PAR ID:
10389351
Author(s) / Creator(s):
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Limnology and Oceanography Letters
Volume:
8
Issue:
2
ISSN:
2378-2242
Page Range / eLocation ID:
p. 267-285
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The LAGOS-US RESERVOIR data module (hereafter, RESERVOIR) classifies all 137,465 lakes > 4 hectares in the conterminous U.S. into one of three categories using a machine-learning predictive model based on visual interpretation of lake outlines and a classification rule based on lake shape. Natural Lakes (NLs) are defined as lakes that are likely to be entirely or mostly naturally formed and that do not have large, flow-altering structures on or near them; Reservoir Class A's (RSVR_As) are defined as lakes that are likely to be either human-made or highly human-altered by the presence of a relatively large water control structure that appears to significantly change the flow of water; and Reservoir Class B's (RSVR_Bs) are lakes that are likely to be entirely human-made based on isolation from rivers and a highly angular shape that is rarely, if ever, seen in natural lakes. We trained the machine learning models on 12,162 manually classified lakes to assign each lake a probability of belonging to one of two categories (NL or RSVR), then further divided the RSVR category into class A or B based on NHD Fcodes, isolation, and angularity. The data module includes a detailed User Guide, metadata tables, and a data table that includes information such as location, lake geometry, surface water connectivity class, and official name. Under our definition, the classification indicates that over 46% of lakes > 4 ha in the conterminous U.S. are reservoir lakes. These data can be combined with other LAGOS-US data modules and U.S. national databases using unique lake identifiers to study both reservoir lakes and natural lakes at broad scales. 
  2. The LAGOS-US LAKE DEPTH v1.0 module (hereafter, called DEPTH) contains in situ measurements of lake depth for a subset of all lakes (n = 17,675) in the conterminous U.S. > 1 ha (3.7% of 479,950) that are in the LAGOS-US LOCUS v1.0 data module (Smith et al. 2021). All 17,675 lakes in DEPTH have a maximum depth value and 6,137 lakes have a mean depth. DEPTH includes approximately 65 data sources obtained from community, government, and university monitoring programs, as well as academic reports and commercial websites. DEPTH includes lake identifiers, lake location, lake area, lake depth (both maximum and mean depth when available), source information, and data flags. The unique lake identifier (lagoslakeid) for all lakes is the same one used in LAGOS-US LOCUS v1.0. 
  3. The LAGOS-US LIMNO data package is one of the core data modules of LAGOS-US, an extensible research-ready platform designed to study the 479,950 lakes and reservoirs larger than or equal to 1 ha in the conterminous US (48 states plus the District of Columbia). The LIMNO module contains in situ observations of 47 parameters of lake physics, chemistry, and biology (hereafter referred to as chemistry) from lake surface samples (defined as observations taken from the epilimnion of a lake) obtained from the Water Quality Portal, the National Lakes Assessment (2007, 2012, 2017), and NEON programs. LIMNO provides 3,511,020 observations across all parameters collected between 1975 and 2021 from 20,329 lakes; the number of observations per lake ranged from 1 to 20,605 with a median of 32. The database design that supports the LAGOS-US research platform was created based on several important design features: lakes are the fundamental unit of consideration, all lakes in the spatial extent above the minimum size must be represented, and most information is connected to individual lakes. The design is modular, interoperable (the modules can be used with each other, as well as other comprehensive lake data products such as the USGS NHD), and extensible (future database modules can be developed and used in the LAGOS-US research platform by others). Users are encouraged to use the other two core data modules that are part of the LAGOS-US platform: LOCUS (location, identifiers, and physical characteristics of lakes and their watersheds) and GEO (characteristics defining geospatial and temporal ecological setting quantified at multiple spatial divisions) that are each found in their own data packages. 
  4. We conducted a macroscale study of 2,210 shallow lakes (mean depth ≤ 3 m or maximum depth ≤ 5 m) in the Upper Midwestern and Northeastern U.S. We asked: What are the patterns and drivers of shallow lake total phosphorus (TP), chlorophyll a (CHLa), and TP–CHLa relationships at the macroscale, how do these differ from those for 4,360 non-shallow lakes, and do results differ by hydrologic connectivity class? To answer these questions, we assembled the LAGOS-NE Shallow Lakes dataset described herein, a dataset derived from the existing LAGOS-NE, LAGOS-DEPTH, and LAGOS-CLIMATE datasets. Response variables were the median of available summer (e.g., 15 June to 15 September) values of TP and CHLa. Predictor variables were assembled at two spatial scales for incorporation into hierarchical models. At the local or lake-specific scale (including the individual lake, its inter-lake watershed [iws] or corresponding HU12 watershed), variables included those representing land use/cover, hydrology, climate, morphometry, and acid deposition. At the regional scale (e.g., HU4 watershed), variables included a smaller set of predictor variables for hydrology and land use/cover. The dataset also includes the unique identifier assigned by LAGOS-NE (lagoslakeid); the latitude and longitude of the study lakes; their maximum and mean depths along with a depth classification of Shallow or non-Shallow; connectivity class (i.e., whether a lake was classified as connected [with inlets and outlets] or unconnected [lacking inlets]); and the zone id for the HU4 to which each lake belongs. Along with the database, we provide the R scripts for the hierarchical models predicting TP or CHLa (TPorCHL_predictive_model.R), and the TP–CHLa relationship (TP_CHL_CSI_Model.R) for depth and connectivity subsets of the study lakes. 
  5. The phenology of critical biological events in aquatic ecosystems is rapidly shifting due to climate change. Growing variability in phenological cues can increase the likelihood of trophic mismatches, causing recruitment failures in commercially, culturally, and recreationally important fisheries. We tested for changes in spawning phenology of regionally important walleye (Sander vitreus) populations in 194 Midwest US lakes in Minnesota, Michigan, and Wisconsin spanning 1939-2019 to investigate factors influencing walleye phenological responses to climate change and associated climate variability, including ice-off timing, lake physical characteristics, and population stocking history. Data from Wisconsin and Michigan lakes (185 and 5 out of 194 total lakes, respectively) were collected by the Wisconsin Department of Natural Resources (WDNR) and the Great Lakes Indian Fish and Wildlife Commission (GLIFWC) through standardized spring walleye mark-recapture surveys and spring tribal harvest season records. Standardized spring mark-recapture population estimates are performed shortly after ice-off, where following a marking event, a subsequent recapture sampling event is conducted using nighttime electrofishing (typically AC – WDNR, pulsed-DC – GLIFWC) of the entire shoreline including islands for small lakes and index stations for large lakes (Hansen et al. 2015) that is timed to coincide with peak walleye spawning activity (G. Hatzenbeler, WDNR, personal communication; M. Luehring, GLIFWC, personal communication; Beard et al. 1997). Data for four additional Minnesota lakes were collected by the Minnesota Department of Natural Resources (MNDNR) beginning in 1939 during annual collections of walleye eggs and broodstock (Schneider et al. 2010), where date of peak egg take was used to index peak spawning activity. 
For lakes where spawning location did not match the lake for which the ice-off data were collected, the spawning location either flowed into (Pike River) or was within 50 km of a lake where ice-off data were available (Pine River) and these ice-off data were used. Following the affirmation of off-reservation Ojibwe tribal fishing rights in the Ceded Territories of Wisconsin and the Upper Peninsula of Michigan in 1987, tribal spearfishers have targeted walleye during spring spawning (Mrnak et al. 2018). Nightly harvests are recorded as part of a compulsory creel survey (US Department of the Interior 1991). Using these records, we calculated the date of peak spawning activity in a given lake-year as the day of maximum tribal harvest. Although we were unable to account for varying effort in these data, a preliminary analysis comparing spawning dates estimated using tribal harvest to those determined from standardized agency surveys in the same lake and year showed that they were highly correlated (Pearson’s correlation: r = 0.91, P < 0.001). For lakes that had walleye spawning data from both agency surveys and tribal harvest, we used the data source with the greatest number of observation years. Ice-off phenology data came from two sources: they were either observed, from the Global Lake and River Ice Phenology database (Benson et al. 2000), or modeled, from a USGS region-wide machine-learning model that used North American Land Data Assimilation System (NLDAS) meteorological inputs combined with lake characteristics (lake position, clarity, size, depth, hypsography, etc.) to predict daily water column temperatures from 1979 to 2022, from which ice-off dates could be derived (https://www.sciencebase.gov/catalog/item/6206d3c2d34ec05caca53071; see Corson-Dosch et al. 2023 for details). 
Modeled data for our study lakes (see Read et al. 2021 for modeling details) performed well in reflecting ice phenology when compared to observed data (i.e., highly significant correlation between observed and modeled ice-off dates when both were available; r = 0.71, p < 0.001). Lake surface area (ha), latitude, and maximum depth (m) were acquired from agency databases and lake reports. Lake class was based on a WDNR lakes classification system (Rypel et al. 2019) that categorized lakes based on temperature, water clarity, depth, and fish community. Walleye stocking history was defined using the walleye stocking classification system developed by the Wisconsin Technical Working Group (see also Sass et al. 2021), which categorized lakes based on relative contributions of naturally-produced and stocked fish to adult recruitment by relying heavily on historic records of age-0 and age-1 catch rates and stocking histories. Wisconsin lakes were divided into three groups: natural recruitment (NR), a combination of stocking and natural recruitment (C-ST), and stocked only (ST). Walleye natural recruitment was indexed as age-0 walleye CPE (number of age-0 walleye captured per km of shoreline electrofished) from WDNR and GLIFWC fall electrofishing surveys (see Hansen et al. 2015 for details). We excluded lake-years where stocking of age-0 fish occurred before age-0 surveys to only include measurements of naturally-reproduced fish. 
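The peak-spawning index described in item 5 (the day of maximum tribal harvest within a lake-year, checked against agency survey dates using Pearson's correlation) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the record layout `(lake, date, harvest_count)`, the function names, and the rule of keeping the earliest date when two nights tie are all hypothetical.

```python
from datetime import date
from math import sqrt


def peak_harvest_dates(records):
    """Map (lake, year) -> date of the largest nightly harvest.

    `records` is an iterable of (lake, date, harvest_count) tuples.
    Ties keep the earliest date, an arbitrary choice for this sketch.
    """
    best = {}
    # Process nights in chronological order so ties resolve to the earliest.
    for lake, day, count in sorted(records, key=lambda r: r[1]):
        key = (lake, day.year)
        if key not in best or count > best[key][1]:
            best[key] = (day, count)
    return {key: day for key, (day, _) in best.items()}


def day_of_year(day: date) -> int:
    """Convert a calendar date to its ordinal day within the year."""
    return day.timetuple().tm_yday


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

To compare sources, one would convert the harvest-based and survey-based peak dates for matching lake-years to day of year with `day_of_year` and pass the two lists to `pearson_r`.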