Title: Modeling Coastal Water Clarity Using Landsat‐8 and Sentinel‐2
Abstract

Understanding and attributing changes to water quality is essential to the study and management of coastal ecosystems and the ecological functions they sustain (e.g., primary productivity, predation, and submerged aquatic vegetation growth). However, describing patterns of water clarity—a key aspect of water quality—over meaningful scales in space and time is challenged by high spatial and temporal variability due to natural and anthropogenic processes. Regionally tuned satellite algorithms can provide a more complete understanding of coastal water clarity changes and drivers. In this study, we used open‐access satellite data and low‐cost in situ methods to improve estimates of water clarity in an optically complex coastal water body. Specifically, we created a remote sensing water clarity product by compiling Landsat‐8 and Sentinel‐2 reflectance data with long‐term Secchi depth measurements at 12 sites over 8 years in a shallow turbid coastal lagoon system in Virginia, USA. Our satellite‐based model explained ∼33% of the variation in in situ water clarity. Our approach increases the spatiotemporal coverage of in situ water clarity data and improves estimates from bio‐optical algorithms that overpredicted water clarity. This could lead to a better understanding of water clarity changes and drivers to better predict how water quality will change in the future.
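The abstract does not state the form of the satellite-based model. As a minimal sketch of how an "explained variation" figure such as ∼33% is typically computed, the following fits an ordinary least-squares line to paired reflectance and Secchi depth values and reports the coefficient of determination (R²). The single-predictor linear form, the function name `fit_and_score`, and the sample values are illustrative assumptions, not the study's method or data.

```python
def fit_and_score(reflectance, secchi_depth):
    """Fit y = slope * x + intercept by least squares and return (slope, intercept, R^2)."""
    n = len(reflectance)
    mean_x = sum(reflectance) / n
    mean_y = sum(secchi_depth) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in reflectance)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(reflectance, secchi_depth))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(reflectance, secchi_depth))
    ss_tot = sum((y - mean_y) ** 2 for y in secchi_depth)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Made-up matchups: surface reflectance (unitless) vs. Secchi depth (m).
example_reflectance = [0.02, 0.03, 0.04, 0.05, 0.06, 0.07]
example_secchi_m = [1.6, 1.1, 1.4, 0.9, 1.2, 0.7]

slope, intercept, r2 = fit_and_score(example_reflectance, example_secchi_m)
print(f"slope={slope:.2f} m per unit reflectance, R^2={r2:.2f}")
```

An R² of 0.33 in this framing would mean that one third of the variance in the in situ Secchi depths is captured by the fitted relationship, with the remainder left unexplained.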

 
Award ID(s):
1832221
NSF-PAR ID:
10429674
Publisher / Repository:
DOI PREFIX: 10.1029
Date Published:
Journal Name:
Earth and Space Science
Volume:
10
Issue:
7
ISSN:
2333-5084
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Groundwater discharge transports dissolved constituents to the ocean, affecting coastal carbon budgets and water quality. However, the magnitude and mechanisms of groundwater exchange along rapidly transitioning Arctic coastlines are largely unknown due to limited observations. Here, using first-of-its-kind coastal Arctic groundwater timeseries data, we evaluate the magnitude and drivers of groundwater discharge to Alaska’s Beaufort Sea coast. Darcy flux calculations reveal temporally variable groundwater fluxes, ranging from −6.5 cm d⁻¹ (recharge) to 14.1 cm d⁻¹ (discharge), with fluctuations in groundwater discharge or aquifer recharge over diurnal and multiday timescales during the open-water season. The average flux during the monitoring period of 4.9 cm d⁻¹ is in line with previous estimates, but the maximum discharge exceeds previous estimates by over an order of magnitude. While the diurnal fluctuations are small due to the microtidal conditions, multiday variability is large and drives sustained periods of aquifer recharge and groundwater discharge. Results show that wind-driven lagoon water level changes are the dominant mechanism of fluctuations in land–sea hydraulic head gradients and, in turn, groundwater discharge. Given the microtidal conditions, low topographic relief, and limited rainfall along the Beaufort Sea coast, we identify wind as an important forcing mechanism of coastal groundwater discharge and aquifer recharge with implications for nearshore biogeochemistry. This study provides insights into groundwater flux dynamics along this coastline over time and highlights an oft-overlooked discharge and circulation mechanism with implications for refining solute export estimates to coastal Arctic waters.
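Darcy flux calculations of the kind mentioned above relate specific discharge to hydraulic conductivity and the land–sea hydraulic head gradient, q = K (h_land − h_lagoon) / L. A minimal sketch follows; the function name, the sign convention (positive for discharge toward the lagoon, negative for recharge), and all numerical inputs are assumptions chosen for illustration and are not taken from the study.

```python
def darcy_flux_cm_per_day(k_cm_per_day, head_land_m, head_lagoon_m, distance_m):
    """Return Darcy flux in cm/d; positive means discharge toward the lagoon.

    k_cm_per_day   -- hydraulic conductivity (hypothetical value, cm/d)
    head_land_m    -- hydraulic head at the landward well (m)
    head_lagoon_m  -- hydraulic head (water level) on the lagoon side (m)
    distance_m     -- horizontal distance between the two measurement points (m)
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    # Head gradient is dimensionless (m of head per m of distance).
    gradient = (head_land_m - head_lagoon_m) / distance_m
    return k_cm_per_day * gradient


# Calm conditions: land head above lagoon level, so groundwater discharges.
q_discharge = darcy_flux_cm_per_day(500.0, 0.30, 0.20, 10.0)

# Wind setup raises the lagoon level above the land head, so the aquifer recharges.
q_recharge = darcy_flux_cm_per_day(500.0, 0.30, 0.35, 10.0)

print(f"discharge case: {q_discharge:.2f} cm/d, recharge case: {q_recharge:.2f} cm/d")
```

The second call illustrates the mechanism the abstract emphasizes: with the land-side head held fixed, a change in lagoon water level alone is enough to reverse the sign of the flux.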

     
  2. Hydrogen peroxide (H₂O₂) is an important reactive oxygen species (ROS) in natural waters, affecting water quality via participation in metal redox reactions and causing oxidative stress for marine ecosystems. While attempts have been made to better understand H₂O₂ dynamics in the global ocean, the relative importance of various H₂O₂ sources and losses remains uncertain. Our model improves previous estimates of photochemical H₂O₂ production rates by using remotely sensed ocean color to characterize the ultraviolet (UV) radiation field in surface water along with quantitative chemical data for the photochemical efficiency of H₂O₂ formation. Wavelength- and temperature-dependent efficiency (i.e., apparent quantum yield, AQY) spectra previously reported for a variety of seawater sources, including coastal and oligotrophic stations in Antarctica, the Pacific Ocean at Station ALOHA, the Gulf of Mexico, and several sites along the eastern coast of the United States, were compiled to obtain a “marine-average” AQY spectrum. To evaluate our predictions of H₂O₂ photoproduction in surface waters using this single AQY spectrum, we compared modeled rates to new measured rates from Gulf Stream, coastal, and nearshore river-outflow stations in the South Atlantic Bight, GA, United States, obtaining comparative differences of 33% or less. In our global model, the “marine-average” AQY spectrum was used with modeled solar irradiance, together with satellite-derived surface seawater temperature and UV optical properties, including diffuse attenuation coefficients and dissolved organic matter absorption coefficients estimated with remote sensing-based algorithms. The final product of the model, a monthly climatology of depth-resolved H₂O₂ photoproduction rates in the surface mixed layer, is reported for the first time and provides an integrated global estimate of ∼21.1 Tmol yr⁻¹ for photochemical H₂O₂ production. This work has important implications for photo-redox reactions in seawater and improves our understanding of the role of solar irradiation on ROS cycling and the overall oxidation state in the oceans.
  3. Over the last century, direct human modification has been a major driver of coastal wetland degradation, resulting in widespread losses of wetland vegetation and a transition to open water. High-resolution satellite imagery is widely available for monitoring changes in present-day wetlands; however, understanding the rates of wetland vegetation loss over the last century depends on the use of historical panchromatic aerial photographs. In this study, we compared manual image thresholding and an automated machine learning (ML) method in detecting wetland vegetation and open water from historical panchromatic photographs in the Florida Everglades, a subtropical wetland landscape. We compared the same classes delineated in the historical photographs to 2012 multispectral satellite imagery and assessed the accuracy of detecting vegetation loss over a 72-year timescale (1940 to 2012) for a range of minimum mapping units (MMUs). Overall, classification accuracies were >95% across the historical photographs and satellite imagery, regardless of the classification method and MMUs. We detected a 2.3–2.7 ha increase in open water pixels across all change maps (overall accuracies >95%). Our analysis demonstrated that ML classification methods can be used to delineate wetland vegetation from open water in low-quality, panchromatic aerial photographs and that a combination of images with different resolutions is compatible with change detection. The study also highlights how evaluating a range of MMUs can identify the effect of scale on detection accuracy and change class estimates and can help determine the most relevant scale of analysis for the process of interest.
  4. Obeid, I. (Ed.)
    The Neural Engineering Data Consortium (NEDC) is developing the Temple University Digital Pathology Corpus (TUDP), an open source database of high-resolution images from scanned pathology samples [1], as part of its National Science Foundation-funded Major Research Instrumentation grant titled “MRI: High Performance Digital Pathology Using Big Data and Machine Learning” [2]. The long-term goal of this project is to release one million images. We have currently scanned over 100,000 images and are in the process of annotating breast tissue data for our first official corpus release, v1.0.0. This release contains 3,505 annotated images of breast tissue including 74 patients with cancerous diagnoses (out of a total of 296 patients). In this poster, we will present an analysis of this corpus and discuss the challenges we have faced in efficiently producing high quality annotations of breast tissue. It is well known that state of the art algorithms in machine learning require vast amounts of data. Fields such as speech recognition [3], image recognition [4] and text processing [5] are able to deliver impressive performance with complex deep learning models because they have developed large corpora to support training of extremely high-dimensional models (e.g., billions of parameters). Other fields that do not have access to such data resources must rely on techniques in which existing models can be adapted to new datasets [6]. A preliminary version of this breast corpus release was tested in a pilot study using a baseline machine learning system, ResNet18 [7], that leverages several open-source Python tools. The pilot corpus was divided into three sets: train, development, and evaluation. Portions of these slides were manually annotated [1] using the nine labels in Table 1 [8] to identify five to ten examples of pathological features on each slide. 
Not every pathological feature is annotated, meaning excluded areas can include focuses particular to these labels that are not used for training. A summary of the number of patches within each label is given in Table 2. To maintain a balanced training set, 1,000 patches of each label were used to train the machine learning model. Throughout all sets, only annotated patches were involved in model development. The performance of this model in identifying all the patches in the evaluation set can be seen in the confusion matrix of classification accuracy in Table 3. The highest performing labels were background, 97% correct identification, and artifact, 76% correct identification. A correlation exists between labels with more than 6,000 development patches and accurate performance on the evaluation set. Additionally, these results indicated a need to further refine the annotation of invasive ductal carcinoma (“indc”), inflammation (“infl”), nonneoplastic features (“nneo”), normal (“norm”) and suspicious (“susp”). This pilot experiment motivated changes to the corpus that will be discussed in detail in this poster presentation. To increase the accuracy of the machine learning model, we modified how we addressed underperforming labels. One common source of error arose with how non-background labels were converted into patches. Large areas of background within other labels were isolated within a patch resulting in connective tissue misrepresenting a non-background label. In response, the annotation overlay margins were revised to exclude benign connective tissue in non-background labels. Corresponding patient reports and supporting immunohistochemical stains further guided annotation reviews. The microscopic diagnoses given by the primary pathologist in these reports detail the pathological findings within each tissue site, but not within each specific slide. 
The microscopic diagnoses informed revisions specifically targeting annotated regions classified as cancerous, ensuring that the labels “indc” and “dcis” were used only in situations where a micropathologist diagnosed it as such. Further differentiation of cancerous and precancerous labels, as well as the location of their focus on a slide, could be accomplished with supplemental immunohistochemically (IHC) stained slides. When distinguishing whether a focus is a nonneoplastic feature versus a cancerous growth, pathologists employ antigen targeting stains to the tissue in question to confirm the diagnosis. For example, a nonneoplastic feature of usual ductal hyperplasia will display diffuse staining for cytokeratin 5 (CK5) and no diffuse staining for estrogen receptor (ER), while a cancerous growth of ductal carcinoma in situ will have negative or focally positive staining for CK5 and diffuse staining for ER [9]. Many tissue samples contain cancerous and non-cancerous features with morphological overlaps that cause variability between annotators. The informative fields IHC slides provide could play an integral role in machine model pathology diagnostics. Following the revisions made on all the annotations, a second experiment was run using ResNet18. Compared to the pilot study, an increase of model prediction accuracy was seen for the labels indc, infl, nneo, norm, and null. This increase is correlated with an increase in annotated area and annotation accuracy. Model performance in identifying the suspicious label decreased by 25% due to the decrease of 57% in the total annotated area described by this label. A summary of the model performance is given in Table 4, which shows the new prediction accuracy and the absolute change in error rate compared to Table 3. The breast tissue subset we are developing includes 3,505 annotated breast pathology slides from 296 patients. The average size of a scanned SVS file is 363 MB. The annotations are stored in an XML format. 
A CSV version of the annotation file is also available which provides a flat, or simple, annotation that is easy for machine learning researchers to access and interface to their systems. Each patient is identified by an anonymized medical reference number. Within each patient’s directory, one or more sessions are identified, also anonymized to the first of the month in which the sample was taken. These sessions are broken into groupings of tissue taken on that date (in this case, breast tissue). A deidentified patient report stored as a flat text file is also available. Within these slides there are a total of 16,971 annotated regions with an average of 4.84 annotations per slide. Among those annotations, 8,035 are non-cancerous (normal, background, null, and artifact), 6,222 are carcinogenic signs (inflammation, nonneoplastic, and suspicious), and 2,714 are cancerous labels (ductal carcinoma in situ and invasive ductal carcinoma in situ). The individual patients are split up into three sets: train, development, and evaluation. Of the 74 cancerous patients, 20 were allotted to each of the development and evaluation sets, while the remaining 34 were allotted to the training set. The remaining 222 patients were split up to preserve the overall distribution of labels within the corpus. This was done in hope of creating control sets for comparable studies. Overall, the development and evaluation sets each have 80 patients, while the training set has 136 patients. In a related component of this project, slides from the Fox Chase Cancer Center (FCCC) Biosample Repository (https://www.foxchase.org/research/facilities/genetic-research-facilities/biosample-repository-facility) are being digitized in addition to slides provided by Temple University Hospital. This data includes 18 different types of tissue including approximately 38.5% urinary tissue and 16.5% gynecological tissue.
These slides and the metadata provided with them are already anonymized and include diagnoses in a spreadsheet with sample and patient ID. We plan to release over 13,000 unannotated slides from the FCCC Corpus simultaneously with v1.0.0 of TUDP. Details of this release will also be discussed in this poster. Few digitally annotated databases of pathology samples like TUDP exist due to the extensive data collection and processing required. The breast corpus subset should be released by November 2021. By December 2021 we should also release the unannotated FCCC data. We are currently annotating urinary tract data as well. We expect to release about 5,600 processed TUH slides in this subset. We have an additional 53,000 unprocessed TUH slides digitized. Corpora of this size will stimulate the development of a new generation of deep learning technology. In clinical settings where resources are limited, an assistive diagnoses model could support pathologists’ workload and even help prioritize suspected cancerous cases.

ACKNOWLEDGMENTS
This material is supported by the National Science Foundation under grants nos. CNS-1726188 and 1925494. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

REFERENCES
[1] N. Shawki et al., “The Temple University Digital Pathology Corpus,” in Signal Processing in Medicine and Biology: Emerging Trends in Research and Applications, 1st ed., I. Obeid, I. Selesnick, and J. Picone, Eds. New York City, New York, USA: Springer, 2020, pp. 67–104. https://www.springer.com/gp/book/9783030368432.
[2] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning.” Major Research Instrumentation (MRI), Division of Computer and Network Systems, Award No. 1726188, January 1, 2018 – December 31, 2021. https://www.isip.piconepress.com/projects/nsf_dpath/.
[3] A. Gulati et al., “Conformer: Convolution-augmented Transformer for Speech Recognition,” in Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), 2020, pp. 5036–5040. https://doi.org/10.21437/interspeech.2020-3015.
[4] C.-J. Wu et al., “Machine Learning at Facebook: Understanding Inference at the Edge,” in Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2019, pp. 331–344. https://ieeexplore.ieee.org/document/8675201.
[5] I. Caswell and B. Liang, “Recent Advances in Google Translate,” Google AI Blog: The latest from Google Research, 2020. [Online]. Available: https://ai.googleblog.com/2020/06/recent-advances-in-google-translate.html. [Accessed: 01-Aug-2021].
[6] V. Khalkhali, N. Shawki, V. Shah, M. Golmohammadi, I. Obeid, and J. Picone, “Low Latency Real-Time Seizure Detection Using Transfer Deep Learning,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2021, pp. 1–7. https://www.isip.piconepress.com/publications/conference_proceedings/2021/ieee_spmb/eeg_transfer_learning/.
[7] J. Picone, T. Farkas, I. Obeid, and Y. Persidsky, “MRI: High Performance Digital Pathology Using Big Data and Machine Learning,” Philadelphia, Pennsylvania, USA, 2020. https://www.isip.piconepress.com/publications/reports/2020/nsf/mri_dpath/.
[8] I. Hunt, S. Husain, J. Simons, I. Obeid, and J. Picone, “Recent Advances in the Temple University Digital Pathology Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2019, pp. 1–4. https://ieeexplore.ieee.org/document/9037859.
[9] A. P. Martinez, C. Cohen, K. Z. Hanley, and X. (Bill) Li, “Estrogen Receptor and Cytokeratin 5 Are Reliable Markers to Separate Usual Ductal Hyperplasia From Atypical Ductal Hyperplasia and Low-Grade Ductal Carcinoma In Situ,” Arch. Pathol. Lab. Med., vol. 140, no. 7, pp. 686–689, Apr. 2016. https://doi.org/10.5858/arpa.2015-0238-OA.
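The patient counts reported in the poster abstract above imply a patient-level stratified split: the 74 cancerous patients are divided 34/20/20 across train/development/evaluation and, given overall set sizes of 136/80/80, the 222 remaining patients must be divided 102/60/60. A minimal sketch of such a split follows. The function name `split_patients`, the placeholder IDs, and the random seed are hypothetical; the abstract also says the split preserved the overall label distribution, which this sketch does not attempt.

```python
import random


def split_patients(cancer_ids, other_ids, seed=0):
    """Assign whole patients to dev/eval/train so no patient spans two sets."""
    rng = random.Random(seed)
    cancer = list(cancer_ids)
    other = list(other_ids)
    rng.shuffle(cancer)
    rng.shuffle(other)

    # (cancerous, non-cancerous) patients per held-out set, from the reported counts.
    held_out_sizes = {"dev": (20, 60), "eval": (20, 60)}

    splits = {}
    c_start = o_start = 0
    for name, (n_cancer, n_other) in held_out_sizes.items():
        splits[name] = (cancer[c_start:c_start + n_cancer]
                        + other[o_start:o_start + n_other])
        c_start += n_cancer
        o_start += n_other

    # Everything not held out goes to training: 34 cancerous + 102 other.
    splits["train"] = cancer[c_start:] + other[o_start:]
    return splits


# Placeholder IDs standing in for the anonymized medical reference numbers.
cancer_ids = [f"c{i:03d}" for i in range(74)]
other_ids = [f"n{i:03d}" for i in range(222)]

splits = split_patients(cancer_ids, other_ids)
print({name: len(ids) for name, ids in splits.items()})
# prints {'dev': 80, 'eval': 80, 'train': 136}
```

Splitting by patient rather than by slide or patch keeps tissue from one person out of both the training and evaluation sets, which is the usual reason for organizing a corpus this way.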
  5. Abstract

    A multimethod process‐oriented investigation of diverse productivity measures in the California Current Ecosystem (CCE) Long‐Term Ecological Research study region, a complex physical environment, is presented. Seven multiday deployments covering a transition region from high to low productivity were conducted over two field expeditions (spring 2016 and summer 2017). Employing a Lagrangian study design, water parcels were followed over several days, comparing 24‐h in situ measurements (¹⁴C and ¹⁵NO₃‐uptake, dilution estimates of phytoplankton growth, and microzooplankton grazing) with high‐resolution productivity measurements by fast repetition rate fluorometry (FRRF) and equilibrium inlet mass spectrometry (EIMS), and integrated carbon export measurements using sediment traps. Results show the importance of accounting for temporal and fine spatial scale variability when estimating ecosystem production. FRRF and EIMS measurements resolved diel patterns in gross primary and net community production. Diel productivity changes agreed well with comparably more traditional measurements. While differences in productivity metrics calculated over different time intervals were considerable, as those methods rely on different base assumptions, the data can be used to explain ecosystem processes which would otherwise have gone unnoticed. The processes resolved from this method comparison further understanding of temporal and spatial coupling and decoupling of surface productivity and potential carbon burial in a gradient from coastal to offshore ecosystems.

     