

Title: A Convolutional Neural Network to Classify Phytoplankton Images Along the West Antarctic Peninsula
Abstract

High-resolution optical imaging systems are quickly becoming universal tools to characterize and quantify microbial diversity in marine ecosystems. Automated classification systems such as convolutional neural networks (CNNs) are often developed to identify species within the immense number of images (e.g., millions per month) collected. The goal of our study was to develop a CNN to classify phytoplankton images collected with an Imaging FlowCytobot for the Palmer Antarctica Long-Term Ecological Research project. A relatively small CNN (~2 million parameters) was developed and trained using a subset of manually identified images, resulting in an overall test accuracy, recall, and f1-score of 93.8%, 93.7%, and 93.7%, respectively, on a balanced dataset. However, the f1-score dropped to 46.5% when tested on a dataset of 10,269 new images drawn from the natural environment without balancing classes. This decrease is likely due to highly imbalanced class distributions dominated by smaller, less differentiable cells, high intraclass variance, and interclass morphological similarities of cells in naturally occurring phytoplankton assemblages. As a case study to illustrate the value of the model, it was used to predict taxonomic classifications (ranging from genus to class) of phytoplankton at Palmer Station, Antarctica, from late austral spring to early autumn in 2017-2018 and 2018-2019. The CNN was generally able to identify important seasonal dynamics such as the shift from large centric diatoms to small pennate diatoms in both years, which is thought to be driven by increases in glacial meltwater from January to March. This shift in particle size distribution has significant implications for the ecology and biogeochemistry of these waters. Moving forward, we hope to further increase the accuracy of our model to better characterize coastal phytoplankton communities threatened by rapidly changing environmental conditions.
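The architecture itself is not reproduced on this page, so the following is only a hedged sketch of what a small classifier at roughly the reported scale (~2 million parameters) could look like in PyTorch; the layer sizes, input resolution, and number of classes are assumptions rather than the authors' design.

```python
# Hypothetical small CNN classifier, roughly at the parameter scale quoted in
# the abstract. Layer widths, 128x128 input crops, and 20 classes are assumed.
import torch
import torch.nn as nn

class SmallPhytoCNN(nn.Module):
    def __init__(self, num_classes: int = 20):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.BatchNorm2d(128), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 256, 3, padding=1), nn.BatchNorm2d(256), nn.ReLU(),
            nn.AdaptiveAvgPool2d((2, 2)),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 2 * 2, 1024), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(1024, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

if __name__ == "__main__":
    model = SmallPhytoCNN()
    print(f"parameters: {sum(p.numel() for p in model.parameters()):,}")  # ~1.5 million
    logits = model(torch.randn(8, 3, 128, 128))  # a batch of 8 assumed 128x128 RGB crops
    print(logits.shape)                          # torch.Size([8, 20])
```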
Award ID(s):
2026045
NSF-PAR ID:
10411787
Author(s) / Creator(s):
Date Published:
Journal Name:
Marine Technology Society Journal
Volume:
56
Issue:
5
ISSN:
0025-3324
Page Range / eLocation ID:
45 to 57
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    High-resolution optical imaging systems are quickly becoming universal tools to characterize and quantify microbial diversity in marine ecosystems. Automated detection systems such as convolutional neural networks (CNNs) are often developed to identify species within the immense number of images collected. The goal of our study was to develop a CNN to classify phytoplankton images collected with an Imaging FlowCytobot for the Palmer Antarctica Long-Term Ecological Research project. A medium-complexity CNN was developed using a subset of manually identified images, resulting in an overall accuracy, recall, and f1-score of 93.8%, 93.7%, and 93.7%, respectively. The f1-score dropped to 46.5% when tested on a new random subset of 10,269 images, likely due to highly imbalanced class distributions, high intraclass variance, and interclass morphological similarities of cells in naturally occurring phytoplankton assemblages. Our model was then used to predict taxonomic classifications of phytoplankton at Palmer Station, Antarctica, over the 2017-2018 and 2018-2019 summer field seasons. The CNN was generally able to capture important seasonal dynamics such as the shift from large centric diatoms to small pennate diatoms in both seasons, which is thought to be driven by increases in glacial meltwater from January to March. Moving forward, we hope to further increase the accuracy of our model to better characterize coastal phytoplankton communities threatened by rapidly changing environmental conditions.
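    As a hedged illustration of the style of evaluation described above, the scikit-learn sketch below computes overall accuracy, recall, and F1 on a balanced test set and on an imbalanced sample the size of the field subset; the label arrays are synthetic placeholders and the macro averaging is an assumption, not the study's evaluation code.

```python
# Placeholder evaluation sketch: the reported metrics (accuracy, recall, F1)
# computed with scikit-learn on synthetic labels, not the study's data.
import numpy as np
from sklearn.metrics import accuracy_score, recall_score, f1_score

rng = np.random.default_rng(0)

# Balanced, manually identified test set (5 assumed classes).
y_true_bal = rng.integers(0, 5, size=1000)
y_pred_bal = y_true_bal.copy()
flip = rng.random(1000) < 0.06                       # simulate ~6% errors
y_pred_bal[flip] = rng.integers(0, 5, size=int(flip.sum()))

# Imbalanced "field" sample dominated by one hard-to-separate class.
y_true_field = rng.choice(5, size=10_269, p=[0.70, 0.15, 0.08, 0.05, 0.02])
y_pred_field = rng.integers(0, 5, size=10_269)       # placeholder predictions

for name, (yt, yp) in {"balanced test": (y_true_bal, y_pred_bal),
                       "field sample": (y_true_field, y_pred_field)}.items():
    print(f"{name}: acc={accuracy_score(yt, yp):.3f}  "
          f"recall={recall_score(yt, yp, average='macro'):.3f}  "
          f"f1={f1_score(yt, yp, average='macro'):.3f}")
```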
  2. Abstract

    In coastal West Antarctic Peninsula (WAP) waters, large phytoplankton blooms in late austral spring fuel a highly productive marine ecosystem. However, WAP atmospheric and oceanic temperatures are rising, winter sea ice extent and duration are decreasing, and summer phytoplankton biomass in the northern WAP has decreased and shifted toward smaller cells. To better understand these relationships, an Imaging FlowCytobot was used to characterize seasonal (spring to autumn) phytoplankton community composition and cell size during a low (2017–2018) and high (2018–2019) chlorophyll a year in relation to physical drivers (e.g., sea ice and meteoric water) at Palmer Station, Antarctica. A shorter sea ice season with early rapid retreat resulted in low phytoplankton biomass with a low proportion of diatoms (2017–2018), while a longer sea ice season with late protracted retreat resulted in the opposite (2018–2019). Despite these differences, phytoplankton seasonal succession was similar in both years: (1) a large‐celled centric diatom bloom during spring sea ice retreat; (2) a peak summer phase comprised of mixotrophic cryptophytes with increases in light and postbloom organic matter; and (3) a late summer phase comprised of small (< 20 μm) diatoms and mixed flagellates with increases in wind‐driven nutrient resuspension. In addition, cell diameter decreased from November to April with increases in meteoric water in both years. The tight coupling between sea ice, meltwater, and phytoplankton species composition suggests that continued warming in the WAP will affect phytoplankton seasonal dynamics, and subsequently seasonal food web dynamics.

     
  3. State-of-the-art deep learning technology has been successfully applied to relatively small selected areas of very high spatial resolution (0.15 and 0.25 m) optical aerial imagery acquired by a fixed-wing aircraft to automatically characterize ice-wedge polygons (IWPs) in the Arctic tundra. However, any mapping of IWPs at regional to continental scales requires images acquired on different sensor platforms (particularly satellite) and a refined understanding of the performance stability of the method across sensor platforms through reliable evaluation assessments. In this study, we examined the transferability of a deep learning Mask Region-Based Convolutional Neural Network (Mask R-CNN) model for mapping IWPs in satellite remote sensing imagery (~0.5 m) covering 272 km² and unmanned aerial vehicle (UAV) imagery (0.02 m) covering 0.32 km². Multi-spectral images were obtained from the WorldView-2 satellite sensor and pan-sharpened to ~0.5 m, and from a 20 MP CMOS sensor camera onboard a UAV, respectively. The training dataset included 25,489 and 6022 manually delineated IWPs from satellite and fixed-wing aircraft aerial imagery near the Arctic Coastal Plain, northern Alaska. Quantitative assessments showed that individual IWPs were correctly detected at up to 72% and 70%, and delineated at up to 73% and 68% F1 score accuracy levels for satellite and UAV images, respectively. Expert-based qualitative assessments showed that IWPs were correctly detected at good (40–60%) and excellent (80–100%) accuracy levels for satellite and UAV images, respectively, and delineated at the excellent (80–100%) level for both. We found that (1) regardless of spatial resolution and spectral bands, the deep learning Mask R-CNN model effectively mapped IWPs in both satellite and UAV images; (2) the model achieved better detection accuracy with finer image resolution, such as UAV imagery, yet better delineation accuracy with coarser image resolution, such as satellite imagery; (3) increasing the number of training data with different resolutions between the training and actual application imagery does not necessarily result in better performance of the Mask R-CNN in IWP mapping; and (4) overall, the model underestimates the total number of IWPs, particularly in terms of disjoint/incomplete IWPs.
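    As a rough illustration of the model family used in this study, the sketch below runs torchvision's off-the-shelf Mask R-CNN on a single image tile; the COCO weights, tile size, and confidence threshold are stand-ins, since the authors trained their own model on manually delineated IWPs.

```python
# Minimal Mask R-CNN inference sketch with torchvision; weights and threshold
# are assumptions, not the study's trained model or configuration.
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(weights="DEFAULT")  # COCO weights as a stand-in
model.eval()

tile = torch.rand(3, 512, 512)  # placeholder for one 512x512 RGB image tile

with torch.no_grad():
    output = model([tile])[0]   # dict with boxes, labels, scores, masks

keep = output["scores"] > 0.5   # assumed confidence threshold
print(f"{int(keep.sum())} instances above threshold")
print(output["masks"][keep].shape)  # (N, 1, 512, 512) soft instance masks
```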
  4. Abstract

    Pollen identification is necessary for several subfields of geology, ecology, and evolutionary biology. However, the existing methods for pollen identification are laborious, time-consuming, and require highly skilled scientists. Therefore, there is a pressing need for an automated and accurate system for pollen identification, which can be beneficial for both basic research and applied issues such as identifying airborne allergens. In this study, we propose a deep learning (DL) approach to classify pollen grains in the Great Basin Desert, Nevada, USA. Our dataset consisted of 10,000 images of 40 pollen species. To mitigate the limitations imposed by the small volume of our training dataset, we conducted an in-depth comparative analysis of numerous pre-trained Convolutional Neural Network (CNN) architectures utilizing transfer learning methodologies. Simultaneously, we developed and incorporated an innovative CNN model, serving to augment our exploration and optimization of data modeling strategies. We applied different architectures of well-known pre-trained deep CNN models, including AlexNet, VGG-16, MobileNet-V2, ResNet (18, 34, 50, and 101), ResNeSt (50, 101), SE-ResNeXt, and Vision Transformer (ViT), to uncover the most promising modeling approach for the classification of pollen grains in the Great Basin. To evaluate the performance of the pre-trained deep CNN models, we measured accuracy, precision, F1-Score, and recall. Our results showed that the ResNeSt-110 model achieved the best performance, with an accuracy of 97.24%, precision of 97.89%, F1-Score of 96.86%, and recall of 97.13%. Our results also revealed that transfer learning models can deliver better and faster image classification results compared to traditional CNN models built from scratch. The proposed method can potentially benefit various fields that rely on efficient pollen identification. This study demonstrates that DL approaches can improve the accuracy and efficiency of pollen identification, and it provides a foundation for further research in the field.
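    The transfer-learning setup described above can be sketched in a few lines of PyTorch: load a pre-trained backbone, freeze it, and swap in a new classification head for the 40 pollen classes. ResNet-50 is used here only as a representative backbone (ResNeSt and ViT come from other packages), and the optimizer settings are assumptions rather than the study's configuration.

```python
# Hedged transfer-learning sketch: pre-trained ResNet-50 with a new head for
# 40 pollen classes; hyperparameters are assumed, not taken from the paper.
import torch.nn as nn
import torch.optim as optim
from torchvision.models import resnet50, ResNet50_Weights

num_classes = 40  # 40 pollen species, per the abstract

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
for param in model.parameters():
    param.requires_grad = False                            # freeze the backbone
model.fc = nn.Linear(model.fc.in_features, num_classes)    # new trainable head

optimizer = optim.Adam(model.fc.parameters(), lr=1e-3)     # assumed settings
criterion = nn.CrossEntropyLoss()
```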

     
  5. Human mesenchymal stem cells (hMSCs) are multipotent progenitor cells with the potential to differentiate into various cell types, including osteoblasts, chondrocytes, and adipocytes. These cells have been extensively employed in the field of cell-based therapies and regenerative medicine due to their inherent attributes of self-renewal and multipotency. Traditional approaches for assessing hMSC differentiation capacity have relied heavily on labor-intensive techniques, such as RT-PCR, immunostaining, and Western blot, to identify specific biomarkers. However, these methods are not only time-consuming and economically demanding, but also require the fixation of cells, resulting in the loss of temporal data. Consequently, there is an emerging need for a more efficient and precise approach to predict hMSC differentiation in live cells, particularly for osteogenic and adipogenic differentiation. In response to this need, we developed innovative approaches that combine live-cell imaging with cutting-edge deep learning techniques, specifically employing a convolutional neural network (CNN) to classify osteogenic and adipogenic differentiation. Four notable pre-trained CNN models, VGG 19, Inception V3, ResNet 18, and ResNet 50, were developed and tested for identifying adipogenic and osteogenic differentiated cells based on cell morphology changes. We rigorously evaluated the performance of these four models on binary and multi-class classification of differentiated cells at various time intervals, focusing on pivotal metrics such as accuracy, the area under the receiver operating characteristic curve (AUC), sensitivity, precision, and F1-score. Among these four models, ResNet 50 proved to be the most effective choice, with the highest accuracy (0.9572 for binary, 0.9474 for multi-class) and AUC (0.9958 for binary, 0.9836 for multi-class) in both classification tasks. Although VGG 19 matched the accuracy of ResNet 50 in both tasks, ResNet 50 consistently outperformed it in terms of AUC, underscoring its superior effectiveness in identifying differentiated cells. Overall, our study demonstrated the capability of a CNN approach to predict stem cell fate based on morphology changes, which will potentially provide insights for the application of cell-based therapy and advance our understanding of regenerative medicine.
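    The binary and multi-class metrics reported above (accuracy and AUC) can be computed with scikit-learn as sketched below; the label and probability arrays are synthetic placeholders, not study data.

```python
# Placeholder sketch of binary and multi-class accuracy/AUC computation.
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(1)

# Binary task, e.g. osteogenic vs. adipogenic (synthetic scores).
y_bin = rng.integers(0, 2, size=200)
p_bin = np.clip(y_bin * 0.7 + rng.normal(0.15, 0.2, size=200), 0, 1)
print("binary AUC:", roc_auc_score(y_bin, p_bin))
print("binary acc:", accuracy_score(y_bin, (p_bin > 0.5).astype(int)))

# Multi-class task, e.g. undifferentiated / osteogenic / adipogenic.
y_multi = rng.integers(0, 3, size=300)
p_multi = rng.dirichlet(np.ones(3), size=300)
p_multi[np.arange(300), y_multi] += 1.0          # bias toward the true class
p_multi /= p_multi.sum(axis=1, keepdims=True)    # renormalize to probabilities
print("multi  AUC:", roc_auc_score(y_multi, p_multi, multi_class="ovr"))
print("multi  acc:", accuracy_score(y_multi, p_multi.argmax(axis=1)))
```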

     