

Title: Using machine learning to correct for nonphotochemical quenching in high‐frequency, in vivo fluorometer data
Abstract

In vivo fluorometers use chlorophyll a fluorescence (Fchl) as a proxy to monitor phytoplankton biomass. However, the fluorescence yield of Fchl is affected by photoprotection processes triggered by increased irradiance (nonphotochemical quenching; NPQ), creating diurnal reductions in Fchl that may be mistaken for phytoplankton biomass reductions. Published correction methods are mostly designed for pelagic oceans and are ill suited for inland waters or for high‐frequency data collection. A machine learning‐based method was developed to correct vertical profiler data from an oligotrophic lake. NPQ was estimated as a percent reduction in Fchl by comparing daytime values to mean, unquenched values from the previous night. A random forest regression was trained on sensor data collected coincident with Fchl, including solar radiation, water temperature, depth, and dissolved oxygen saturation. The accuracy of the model was assessed using a grouped 10‐fold cross validation (mean absolute error [MAE]: 7.6%; root mean square error [RMSE]: 10.2%), and the model was then used to correct Fchl profiles. The model also predicted NPQ and corrected unseen Fchl profiles from a future period with excellent results (MAE: 9.0%; RMSE: 14.4%). Fchl profiles were then correlated to laboratory results, allowing corrected profiles to be compared directly to collected samples. The correction reduced error (RMSE) due to NPQ from 0.67 μg L−1 to 0.33 μg L−1 when compared to uncorrected Fchl data. These results suggest that the use of machine learning models may be an effective way to correct for NPQ and may have universal applicability.
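
The arithmetic behind the NPQ estimate and the reported error metrics can be sketched as follows. This is a minimal illustration and not the authors' code. The function names are hypothetical, the night-time reference is assumed to come from the same depth as the daytime reading, and the correction step (dividing by the unquenched fraction) is an assumed inversion of the percent-reduction definition, which the abstract does not spell out. In the study the NPQ percentage applied to a profile is predicted by the random forest model, which is not reproduced here.

```python
from math import sqrt
from statistics import mean


def npq_percent(day_fchl, night_fchl_values):
    """Percent reduction of a daytime Fchl reading relative to the
    mean of the previous night's (unquenched) readings."""
    night_mean = mean(night_fchl_values)
    if night_mean <= 0:
        raise ValueError("night-time mean Fchl must be positive")
    return 100.0 * (night_mean - day_fchl) / night_mean


def correct_fchl(day_fchl, npq_pct):
    """Assumed inversion: scale the quenched reading back up by the
    fraction of fluorescence that remains after quenching."""
    if npq_pct >= 100.0:
        raise ValueError("NPQ of 100% or more cannot be inverted")
    return day_fchl / (1.0 - npq_pct / 100.0)


def mae(observed, predicted):
    """Mean absolute error between paired values."""
    return mean([abs(o - p) for o, p in zip(observed, predicted)])


def rmse(observed, predicted):
    """Root mean square error between paired values."""
    return sqrt(mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))
```

For example, a daytime reading of 3.0 against a previous-night mean of 4.0 corresponds to 25% NPQ under this definition, and applying a predicted NPQ of 25% to that reading restores it to 4.0.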

 
NSF-PAR ID:
10378523
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  
Publisher / Repository:
Wiley Blackwell (John Wiley & Sons)
Date Published:
Journal Name:
Limnology and Oceanography: Methods
Volume:
18
Issue:
9
ISSN:
1541-5856
Page Range / eLocation ID:
p. 477-494
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Water quality monitoring is relevant for protecting the designated, or beneficial, uses of water, such as drinking, aquatic life, recreation, irrigation, and food supply, that support the economy, human well-being, and aquatic ecosystem health. Managing finite water resources to support these designated uses requires information on water quality so that managers can make sustainable decisions. Chlorophyll-a (chl-a, µg L−1) concentration can serve as a proxy for phytoplankton biomass and may be used as an indicator of increased anthropogenic nutrient stress. Satellite remote sensing may present a complement to in situ measures for assessments of water quality through the retrieval of chl-a with in-water algorithms. Validation of chl-a algorithms across US lakes improves algorithm maturity relevant for monitoring applications. This study compares performance of the Case 2 Regional Coast Colour (C2RCC) chl-a retrieval algorithm, a revised version of the Maximum-Peak Height (MPH(P)) algorithm, and three scenarios merging these two approaches. Satellite data were retrieved from the MEdium Resolution Imaging Spectrometer (MERIS) and the Ocean and Land Colour Instrument (OLCI), while field observations were obtained from 181 lakes matched with U.S. Water Quality Portal chl-a data. The best performance based on mean absolute multiplicative error (MAEmult) was demonstrated by the merged algorithm referred to as C15−M10 (MAEmult = 1.8, biasmult = 0.97, n = 836). In the C15−M10 algorithm, the MPH(P) chl-a value was retained if it was > 10 µg L−1; if the MPH(P) value was ≤ 10 µg L−1, the C2RCC value was selected, as long as that value was < 15 µg L−1. Time-series and lake-wide gradients compared against independent assessments from Lake Champlain and long-term ecological research stations in Wisconsin were used as complementary examples supporting water quality reporting requirements. Trophic state assessments for Wisconsin lakes provided examples in support of inland water quality monitoring applications. This study presents and assesses merged adaptations of chl-a algorithms previously reported independently. Additionally, it contributes to the transition of chl-a algorithm maturity by quantifying error statistics for a number of locations and times.
  2. Abstract

    To measure chlorophyll a (Chl a) fluorescence (Fchl), fluorometers use an excitation wavelength that is within the visible spectrum of most zooplankton, and as a result has the potential to cause a phototactic response in zooplankton. The transparent bodies of herbivorous zooplankton may allow viable chlorophyll a within an individual's digestive tract to fluoresce in response to sensor excitation light, resulting in measurement bias. To test for this bias, a fully factorial (± zooplankton and ± light) experiment was conducted in an oligotrophic lake. Excitation light from fluorometers triggered a positive phototactic response during nighttime hours, resulting in swarms of zooplankton congregating beneath the sensor. The maximum hourly mean Fchl from nighttime/open treatments was higher and more variable than nighttime/zooplankton exclusion treatments, with the greatest single hour difference of 7.34 relative fluorescence units (RFU) vs. 0.26 RFU. In open treatments, sustained periods of Fchl exceeded 31x the values of exclusion treatments. A second series of experiments pulsed excitation lights in alternating periods in order to characterize zooplankton response times. Sensor bias was detected in as little as 20 s after initial illumination. Collectively, these results suggest that swarms of phototactic zooplankton can cause substantial bias in Fchl measurements at night. To correct for this bias, post‐processing methods using time series decomposition were demonstrated to remove the majority of Fchl bias.

     
  3. Abstract

    Transitions in phytoplankton community composition are typically attributed to ecological succession even in physically dynamic upwelling systems like the California Current Ecosystem (CCE). An expected succession from a high‐chlorophyll (~10 μg L−1) diatom‐dominated assemblage to a low‐chlorophyll (< 1.0 μg L−1) non‐diatom‐dominated assemblage was observed during a 2013 summer upwelling event in the CCE. Using an interdisciplinary field‐based space‐for‐time approach leveraging both biogeochemical rate measurements and metatranscriptomics, we suggest that this successional pattern was driven primarily by physical processes. An annually recurring mesoscale eddy‐like feature transported significant quantities of high‐phytoplankton‐biomass coastal water offshore. Chlorophyll was diluted during transport, but diatom contributions to phytoplankton biomass and activity (49–62% observed) did not decline to the extent predicted by dilution (18–24% predicted). Under the space‐for‐time assumption, these trends imply that diatom biomass and activity were stimulated during transport. This is hypothesized to result from decreased contact rates with mortality agents (e.g., viruses) and release from nutrient limitation (confirmed by rate data nearshore), as predicted by the Disturbance‐Recovery hypothesis of phytoplankton bloom formation. Thus, the end point taxonomic composition and activity of the phytoplankton assemblage being transported by the eddy‐like feature were driven by physical processes (mixing) affecting physiological (release from nutrient limitation, increased growth) and ecological (reduced mortality) factors that favored the persistence of the nearshore diatoms during transit. The observed connection between high‐diatom‐biomass coastal waters and non‐diatom‐dominated offshore waters supports the proposed mechanisms for this recurring eddy‐like feature moving seed populations of coastal phytoplankton offshore and thereby sustaining their activity.

     
  4. Abstract

    Two oceanographic cruises were completed in September 2016 and August 2017 to investigate the distribution of particulate organic matter (POM) across the northeast Chukchi Shelf. Both periods were characterized by highly stratified conditions, with major contrasts in the distribution of regional water masses that impacted POM distributions. Overall, surface waters were characterized by low chlorophyll fluorescence (Chl Fl < 0.8 mg m−3) and particle beam attenuation (cp < 0.3 m−1) values, and low concentrations of particulate organic carbon (POC < 8 mmol m−3), chlorophyll and pheophytin (Chl + Pheo < 0.8 mg m−3), and suspended particulate matter (SPM ∼2 g m−3). Elevated Chl Fl and Chl + Pheo (∼2 mg m−3) values measured at mid‐depths below the pycnocline defined the subsurface chlorophyll maxima (SCM), which exhibited moderate POC (∼10 mmol m−3), cp (∼0.4 m−1) and SPM (∼3 g m−3). In contrast, deeper waters below the pycnocline were characterized by low Chl Fl and Chl + Pheo (∼0.7 mg m−3), high cp (>1.5 m−1) and SPM (>8 g m−3) and elevated POC (>10 mmol m−3). POM compositions from surface and SCM regions of the water column were consistent with contributions from active phytoplankton sources whereas samples from bottom waters were characterized by high Pheo/(Chl + Pheo) ratios (>0.4) indicative of altered phytoplankton detritus. Marked contrasts in POM were observed in both surface and middepth waters during both cruises. Increases in chlorophyll and POC consistent with enhanced productivity were measured in middepth waters during the September 2016 cruise following a period of downwelling‐favorable winds, and in surface waters during the August 2017 cruise following a period of upwelling‐favorable winds.

     
  5. In this study, we present a nationwide machine learning model for hourly PM2.5 estimation for the continental United States (US) using high temporal resolution Geostationary Operational Environmental Satellites (GOES-16) Aerosol Optical Depth (AOD) data, meteorological variables from the European Centre for Medium-Range Weather Forecasts (ECMWF), and ancillary data collected between May 2017 and December 2020. A model sensitivity analysis was conducted on predictor variables to determine the optimal model. GOES-16 AOD, ECMWF variables, and ancillary data proved to be effective predictors for PM2.5 estimation and historical reconstruction; the resulting model achieves an average mean absolute error (MAE) of 3.0 μg/m3 and a root mean square error (RMSE) of 5.8 μg/m3. This study also found that the model performance as well as the site-measured PM2.5 concentrations demonstrate strong spatial and temporal patterns. Specifically, on the temporal scale, the model performed best between 8:00 p.m. and 11:00 p.m. (UTC) and had the highest coefficient of determination (R2) in autumn and the lowest MAE and RMSE in spring. On the spatial scale, the analysis results based on ancillary data show that the R2 scores correlate positively with the mean measured PM2.5 concentration at monitoring sites. Mean measured PM2.5 concentrations are positively correlated with population density and negatively correlated with elevation. Water, forests, and wetlands are associated with low PM2.5 concentrations, whereas developed land, cultivated crops, shrubs, and grass are associated with high PM2.5 concentrations. In addition, the reconstructed PM2.5 surfaces serve as an important data source for pollution event tracking and PM2.5 analysis. For this purpose, hourly PM2.5 estimates were made at 10 km by 10 km resolution from May 2017 to December 2020, and the estimates from August through November 2020, during the period of the California Santa Clara Unit (SCU) Lightning Complex fires, are presented. Based on the quantitative and visualization results, this study reveals that a number of large wildfires in California had a profound impact on the value and spatial-temporal distributions of PM2.5 concentrations.