Abstract We assess whether a supervised machine learning algorithm, specifically a convolutional neural network (CNN), achieves higher accuracy on planktonic image classification when including non‐plankton and ancillary plankton during the training procedure. We focus on the case of optimizing the CNN for a single planktonic image source, while considering ancillary images to be plankton images from other instruments. We conducted two sets of experiments with three different types of plankton images (from aZooglider, Underwater Vision Profiler 5, and Zooscan), and our results held across all three image types. First, we considered whether single‐stage transfer learning using non‐plankton images was beneficial. For this assessment, we used ImageNet images and the 2015 ImageNet contest‐winning model, ResNet‐152. We found increased accuracy using a ResNet‐152 model pretrained on ImageNet, provided the entire network was retrained rather than retraining only the fully connected layers. Next, we combined all three plankton image types into a single dataset with 3.3 million images (despite their differences in contrast, resolution, and pixel pitch) and conducted a multistage transfer learning assessment. We executed a transfer learning stage from ImageNet to the merged ancillary plankton dataset, then a second transfer learning stage from that merged plankton model to a single instrument dataset. We found that multistage transfer learning resulted in additional accuracy gains. These results should have generality for other image classification tasks.
more »
« less
Taming the data deluge: A novel end‐to‐end deep learning system for classifying marine biological and environmental images
Abstract Underwater imaging enables nondestructive plankton sampling at frequencies, durations, and resolutions unattainable by traditional methods. These systems necessitate automated processes to identify organisms efficiently. Early underwater image processing used a standard approach: binarizing images to segment targets, then integrating deep learning models for classification. While intuitive, this infrastructure has limitations in handling high concentrations of biotic and abiotic particles, rapid changes in dominant taxa, and highly variable target sizes. To address these challenges, we introduce a new framework that starts with a scene classifier to capture large within‐image variation, such as disparities in the layout of particles and dominant taxa. After scene classification, scene‐specific Mask regional convolutional neural network (Mask R‐CNN) models are trained to separate target objects into different groups. The procedure allows information to be extracted from different image types, while minimizing potential bias for commonly occurring features. Using in situ coastal plankton images, we compared the scene‐specific models to the Mask R‐CNN model encompassing all scene categories as a single full model. Results showed that the scene‐specific approach outperformed the full model by achieving a 20% accuracy improvement in complex noisy images. The full model yielded counts that were up to 78% lower than those enumerated by the scene‐specific model for some small‐sized plankton groups. We further tested the framework on images from a benthic video camera and an imaging sonar system with good results. The integration of scene classification, which groups similar images together, can improve the accuracy of detection and classification for complex marine biological images.
more »
« less
- Award ID(s):
- 2224702
- PAR ID:
- 10545276
- Publisher / Repository:
- Association for the Sciences of Limnology and Oceanography
- Date Published:
- Journal Name:
- Limnology and Oceanography: Methods
- Volume:
- 22
- Issue:
- 1
- ISSN:
- 1541-5856
- Page Range / eLocation ID:
- 47 to 64
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Zelinski, Michael E.; Taha, Tarek M.; Howe, Jonathan (Ed.)Image classification forms an important class of problems in machine learning and is widely used in many realworld applications, such as medicine, ecology, astronomy, and defense. Convolutional neural networks (CNNs) are machine learning techniques designed for inputs with grid structures, e.g., images, whose features are spatially correlated. As such, CNNs have been demonstrated to be highly effective approaches for many image classification problems and have consistently outperformed other approaches in many image classification and object detection competitions. A particular challenge involved in using machine learning for classifying images is measurement data loss in the form of missing pixels, which occurs in settings where scene occlusions are present or where the photodetectors in the imaging system are partially damaged. In such cases, the performance of CNN models tends to deteriorate or becomes unreliable even when the perturbations to the input image are small. In this work, we investigate techniques for improving the performance of CNN models for image classification with missing data. In particular, we explore training on a variety of data alterations that mimic data loss for producing more robust classifiers. By optimizing the categorical cross-entropy loss function, we demonstrate through numerical experiments on the MNIST dataset that training with these synthetic alterations can enhance the classification accuracy of our CNN models.more » « less
-
The microtopography associated with ice-wedge polygons governs many aspects of Arctic ecosystem, permafrost, and hydrologic dynamics from local to regional scales owing to the linkages between microtopography and the flow and storage of water, vegetation succession, and permafrost dynamics. Wide-spread ice-wedge degradation is transforming low-centered polygons into high-centered polygons at an alarming rate. Accurate data on spatial distribution of ice-wedge polygons at a pan-Arctic scale are not yet available, despite the availability of sub-meter-scale remote sensing imagery. This is because the necessary spatial detail quickly produces data volumes that hamper both manual and semi-automated mapping approaches across large geographical extents. Accordingly, transforming big imagery into ‘science-ready’ insightful analytics demands novel image-to-assessment pipelines that are fueled by advanced machine learning techniques and high-performance computational resources. In this exploratory study, we tasked a deep-learning driven object instance segmentation method (i.e., the Mask R-CNN) with delineating and classifying ice-wedge polygons in very high spatial resolution aerial orthoimagery. We conducted a systematic experiment to gauge the performances and interoperability of the Mask R-CNN across spatial resolutions (0.15 m to 1 m) and image scene contents (a total of 134 km2) near Nuiqsut, Northern Alaska. The trained Mask R-CNN reported mean average precisions of 0.70 and 0.60 at thresholds of 0.50 and 0.75, respectively. Manual validations showed that approximately 95% of individual ice-wedge polygons were correctly delineated and classified, with an overall classification accuracy of 79%. Our findings show that the Mask R-CNN is a robust method to automatically identify ice-wedge polygons from fine-resolution optical imagery. Overall, this automated imagery-enabled intense mapping approach can provide a foundational framework that may propel future pan-Arctic studies of permafrost thaw, tundra landscape evolution, and the role of high latitudes in the global climate system.more » « less
-
e apply a new deep learning technique to detect, classify, and deblend sources in multi-band astronomical images. We train and evaluate the performance of an artificial neural network built on the Mask R-CNN image processing framework, a general code for efficient object detection, classification, and instance segmentation. After evaluating the performance of our network against simulated ground truth images for star and galaxy classes, we find a precision of 92% at 80% recall for stars and a precision of 98% at 80% recall for galaxies in a typical field with ∼30 galaxies/arcmin2. We investigate the deblending capability of our code, and find that clean deblends are handled robustly during object masking, even for significantly blended sources. This technique, or extensions using similar network architectures, may be applied to current and future deep imaging surveys such as LSST and WFIRST. Our code, Astro R-CNN, is publicly available at https://github.com/burke86/astro_rcnnmore » « less
-
State-of-the-art deep learning technology has been successfully applied to relatively small selected areas of very high spatial resolution (0.15 and 0.25 m) optical aerial imagery acquired by a fixed-wing aircraft to automatically characterize ice-wedge polygons (IWPs) in the Arctic tundra. However, any mapping of IWPs at regional to continental scales requires images acquired on different sensor platforms (particularly satellite) and a refined understanding of the performance stability of the method across sensor platforms through reliable evaluation assessments. In this study, we examined the transferability of a deep learning Mask Region-Based Convolutional Neural Network (R-CNN) model for mapping IWPs in satellite remote sensing imagery (~0.5 m) covering 272 km2 and unmanned aerial vehicle (UAV) (0.02 m) imagery covering 0.32 km2. Multi-spectral images were obtained from the WorldView-2 satellite sensor and pan-sharpened to ~0.5 m, and a 20 mp CMOS sensor camera onboard a UAV, respectively. The training dataset included 25,489 and 6022 manually delineated IWPs from satellite and fixed-wing aircraft aerial imagery near the Arctic Coastal Plain, northern Alaska. Quantitative assessments showed that individual IWPs were correctly detected at up to 72% and 70%, and delineated at up to 73% and 68% F1 score accuracy levels for satellite and UAV images, respectively. Expert-based qualitative assessments showed that IWPs were correctly detected at good (40–60%) and excellent (80–100%) accuracy levels for satellite and UAV images, respectively, and delineated at excellent (80–100%) level for both images. We found that (1) regardless of spatial resolution and spectral bands, the deep learning Mask R-CNN model effectively mapped IWPs in both remote sensing satellite and UAV images; (2) the model achieved a better accuracy in detection with finer image resolution, such as UAV imagery, yet a better accuracy in delineation with coarser image resolution, such as satellite imagery; (3) increasing the number of training data with different resolutions between the training and actual application imagery does not necessarily result in better performance of the Mask R-CNN in IWPs mapping; (4) and overall, the model underestimates the total number of IWPs particularly in terms of disjoint/incomplete IWPs.more » « less
An official website of the United States government

