Title: A Convolutional Neural Network to Classify Phytoplankton Images Along the West Antarctic Peninsula
Abstract: High-resolution optical imaging systems are quickly becoming universal tools to characterize and quantify microbial diversity in marine ecosystems. Automated classification systems such as convolutional neural networks (CNNs) are often developed to identify species within the immense number of images (e.g., millions per month) collected. The goal of our study was to develop a CNN to classify phytoplankton images collected with an Imaging FlowCytobot for the Palmer Antarctica Long-Term Ecological Research project. A relatively small CNN (~2 million parameters) was developed and trained using a subset of manually identified images, resulting in an overall test accuracy, recall, and f1-score of 93.8%, 93.7%, and 93.7%, respectively, on a balanced dataset. However, the f1-score dropped to 46.5% when tested on a dataset of 10,269 new images drawn from the natural environment without balancing classes. This decrease is likely due to highly imbalanced class distributions dominated by smaller, less differentiable cells, high intraclass variance, and interclass morphological similarities of cells in naturally occurring phytoplankton assemblages. As a case study to illustrate the value of the model, it was used to predict taxonomic classifications (ranging from genus to class) of phytoplankton at Palmer Station, Antarctica, from late austral spring to early autumn in 2017–2018 and 2018–2019. The CNN was generally able to identify important seasonal dynamics such as the shift from large centric diatoms to small pennate diatoms in both years, which is thought to be driven by increases in glacial meltwater from January to March. This shift in particle size distribution has significant implications for the ecology and biogeochemistry of these waters. Moving forward, we hope to further increase the accuracy of our model to better characterize coastal phytoplankton communities threatened by rapidly changing environmental conditions.
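The drop in f1-score on the unbalanced dataset can be illustrated with a minimal Python sketch. This is not the authors' code or data: the two class names, the image counts, and the 90% per-class recall are hypothetical values chosen for illustration, and the sketch assumes the f1-score is an unweighted (macro) average over classes, which the abstract does not state.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def simulate(n_small, n_large, recall=0.9):
    """Hypothetical two-class classifier with the same recall on each class.

    Every misclassified image is assigned to the other class.
    """
    y_true, y_pred = [], []
    for label, other, n in (("small", "large", n_small),
                            ("large", "small", n_large)):
        n_correct = round(n * recall)
        y_true += [label] * n
        y_pred += [label] * n_correct + [other] * (n - n_correct)
    return y_true, y_pred


# Same classifier behavior, two different class distributions (made-up counts).
balanced = simulate(100, 100)
imbalanced = simulate(1000, 10)

for name, (y_true, y_pred) in (("balanced", balanced),
                               ("imbalanced", imbalanced)):
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.3f} "
          f"macro-F1={macro_f1(y_true, y_pred):.3f}")
```

In this toy setting accuracy stays at 0.90 on both sets, while the macro F1 falls from 0.90 to about 0.55, because the misclassified images from the abundant class swamp the precision of the rare class.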
Award ID(s):
2026045
PAR ID:
10411787
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Marine Technology Society Journal
Volume:
56
Issue:
5
ISSN:
0025-3324
Page Range / eLocation ID:
45 to 57
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. High-resolution optical imaging systems are quickly becoming universal tools to characterize and quantify microbial diversity in marine ecosystems. Automated detection systems such as convolutional neural networks (CNNs) are often developed to identify the immense number of images collected. The goal of our study was to develop a CNN to classify phytoplankton images collected with an Imaging FlowCytobot for the Palmer Antarctica Long-Term Ecological Research project. A medium complexity CNN was developed using a subset of manually identified images, resulting in an overall accuracy, recall, and f1-score of 93.8%, 93.7%, and 93.7%, respectively. The f1-score dropped to 46.5% when tested on a new random subset of 10,269 images, likely due to highly imbalanced class distributions, high intraclass variance, and interclass morphological similarities of cells in naturally occurring phytoplankton assemblages. Our model was then used to predict taxonomic classifications of phytoplankton at Palmer Station, Antarctica over 2017-2018 and 2018-2019 summer field seasons. The CNN was generally able to capture important seasonal dynamics such as the shift from large centric diatoms to small pennate diatoms in both seasons, which is thought to be driven by increases in glacial meltwater from January to March. Moving forward, we hope to further increase the accuracy of our model to better characterize coastal phytoplankton communities threatened by rapidly changing environmental conditions.
  2. In coastal West Antarctic Peninsula (WAP) waters, large phytoplankton blooms in late austral spring fuel a highly productive marine ecosystem. However, WAP atmospheric and oceanic temperatures are rising, winter sea ice extent and duration are decreasing, and summer phytoplankton biomass in the northern WAP has decreased and shifted toward smaller cells. To better understand these relationships, an Imaging FlowCytobot was used to characterize seasonal (spring to autumn) phytoplankton community composition and cell size during a low (2017–2018) and high (2018–2019) chlorophyll a year in relation to physical drivers (e.g., sea ice and meteoric water) at Palmer Station, Antarctica. A shorter sea ice season with early rapid retreat resulted in low phytoplankton biomass with a low proportion of diatoms (2017–2018), while a longer sea ice season with late protracted retreat resulted in the opposite (2018–2019). Despite these differences, phytoplankton seasonal succession was similar in both years: (1) a large‐celled centric diatom bloom during spring sea ice retreat; (2) a peak summer phase comprised of mixotrophic cryptophytes with increases in light and postbloom organic matter; and (3) a late summer phase comprised of small (< 20 μm) diatoms and mixed flagellates with increases in wind‐driven nutrient resuspension. In addition, cell diameter decreased from November to April with increases in meteoric water in both years. The tight coupling between sea ice, meltwater, and phytoplankton species composition suggests that continued warming in the WAP will affect phytoplankton seasonal dynamics, and subsequently seasonal food web dynamics.
  3. In this article, we propose a deep learning based semantic segmentation model that identifies and segments defects in electroluminescence (EL) images of silicon photovoltaic (PV) cells. The proposed model can differentiate between cracks, contact interruptions, cell interconnect failures, and contact corrosion for both multicrystalline and monocrystalline silicon cells. Our model utilizes a segmentation Deeplabv3 model with a ResNet-50 backbone. It was trained on 17,064 EL images including 256 physically realistic simulated images of PV cells generated to deal with class imbalance. While performing semantic segmentation for five defect classes, this model achieves a weighted F1-score of 0.95, an unweighted F1-score of 0.69, a pixel-level global accuracy of 95.4%, and a mean intersection over union score of 57.3%. In addition, we introduce the UCF EL Defect dataset, a large-scale dataset consisting of 17,064 EL images, which will be publicly available for use by the PV and computer vision research communities. 
  4. In smart manufacturing, semiconductors play an indispensable role in collecting, processing, and analyzing data, ultimately enabling more agile and productive operations. Given the foundational importance of wafers, the purity of a wafer is essential to maintain the integrity of the overall semiconductor fabrication. This study proposes a novel automated visual inspection (AVI) framework for scrutinizing semiconductor wafers from scratch, capable of identifying defective wafers and pinpointing the location of defects through autonomous data annotation. Initially, this proposed methodology leveraged a texture analysis method known as gray-level co-occurrence matrix (GLCM) that categorized wafer images—captured via a stroboscopic imaging system—into distinct scenarios for high- and low-resolution wafer images. GLCM approaches further allowed for a complete separation of low-resolution wafer images into defective and normal wafer images, as well as the extraction of defect images from defective low-resolution wafer images, which were used for training a convolutional neural network (CNN) model. Consequently, the CNN model excelled in localizing defects on defective low-resolution wafer images, achieving an F1 score—the harmonic mean of precision and recall metrics—exceeding 90.1%. In high-resolution wafer images, a background subtraction technique represented defects as clusters of white points. The quantity of these white points determined the defectiveness and pinpointed locations of defects on high-resolution wafer images. Lastly, the CNN implementation further enhanced performance, robustness, and consistency irrespective of variations in the ratio of white point clusters. This technique demonstrated accuracy in localizing defects on high-resolution wafer images, yielding an F1 score greater than 99.3%. 
  5. An increase in volcanic thermal emissions can indicate subsurface and surface processes that precede, or coincide with, volcanic eruptions. Space-borne infrared sensors can detect hotspots—defined here as localized volcanic thermal emissions—in near-real-time. However, automatic hotspot detection systems are needed to efficiently analyze the large quantities of data produced. While hotspots have been automatically detected for over 20 years with simple thresholding algorithms, new computer vision technologies, such as convolutional neural networks (CNNs), can enable improved detection capabilities. Here we introduce HotLINK: the Hotspot Learning and Identification Network, a CNN trained to detect hotspots with a dataset of ~3,800 satellite-based, Visible Infrared Imaging Radiometer Suite (VIIRS) images from Mount Veniaminof and Mount Cleveland volcanoes, Alaska. We find that our model achieves an accuracy of 96% (F1-score 0.92) when evaluated on ~1,700 unseen images from the same volcanoes, and 95% (F1-score 0.67) when evaluated on ~3,000 images from six additional Alaska volcanoes (Augustine Volcano, Bogoslof Island, Okmok Caldera, Pavlof Volcano, Redoubt Volcano, Shishaldin Volcano). In comparison with an existing threshold-based hotspot detection algorithm, MIROVA (Coppola et al., Geological Society, London, Special Publications, 2016, 426, 181–205), our model detects 22% more hotspots and produces 12% fewer false positives. Additional testing on ~700 labeled Moderate Resolution Imaging Spectroradiometer (MODIS) images from Mount Veniaminof demonstrates that our model is applicable to this sensor’s data as well, achieving an accuracy of 98% (F1-score 0.95). We apply HotLINK to 10 years of VIIRS data and 22 years of MODIS data for the eight aforementioned Alaska volcanoes and calculate the radiative power of detected hotspots. From these time series we find that HotLINK accurately characterizes background and eruptive periods, similar to MIROVA, but also detects more subtle warming signals, potentially related to volcanic unrest. We identify three advantages to our model over its predecessors: 1) the ability to detect more subtle volcanic hotspots and produce fewer false positives, especially in daytime images; 2) probabilistic predictions provide a measure of detection confidence; and 3) its transferability, i.e., the successful application to multiple sensors and multiple volcanoes without the need for threshold tuning, suggesting the potential for global application.