The rapidly growing use of imaging infrastructure in the energy materials domain drives significant data accumulation in terms of their amount and complexity. The applications of routine techniques for image processing in materials research are often ad hoc, indiscriminate, and empirical, which renders the crucial task of obtaining reliable metrics for quantifications obscure. Moreover, these techniques are expensive, slow, and often involve several preprocessing steps. This paper presents a novel deep learning-based approach for the high-throughput analysis of the particle size distributions from transmission electron microscopy (TEM) images of carbon-supported catalysts for polymer electrolyte fuel cells. A dataset of 40 high-resolution TEM images at different magnification levels, from 10 to 100 nm scales, was annotated manually. This dataset was used to train the U-Net model, with the StarDist formulation for the loss function, for the nanoparticle segmentation task. StarDist reached a precision of 86%, recall of 85%, and an F1-score of 85% by training on datasets as small as thirty images. The segmentation maps outperform models reported in the literature for a similar problem, and the results on particle size analyses agree well with manual particle size measurements, albeit at a significantly lower cost.
A comparative analysis of YOLOv8 and U-Net image segmentation approaches for transmission electron micrographs of polycrystalline thin films
Metallic thin films offer a platform to experimentally study the dynamics of microstructural evolution, but the required transmission electron microscopy (TEM)-based imaging generates complex images that are challenging to segment and quantify. This work provides a comparative analysis of a new YOLOv8 model and an established U-Net model for bright-field TEM images of polycrystals, employing a framework leveraging physical observables to evaluate performance against two hand-traced benchmark datasets. This methodology obviates the comparison of large, diversely structured, and manually labeled datasets that are required to assess performance on a per-image/per-pixel basis. It is found that the YOLOv8 model, adapted for real-time instance segmentation, has up to 43× faster inferencing (NVIDIA GeForce RTX 4090) compared to U-Net and reconstructs hand-traced grain size distributions (GSDs) with excellent fidelity, finding mean diameter within 3% for grains near an optimal magnification; for grains that deviate from the optimal pixel-diameter, the size of small- (large)-diameter grains is systematically over- (under)-estimated. This is partially mitigated by including scale-aware augmentations during training. Moreover, when the bias is corrected post-inference by a rigid shift in distribution, the YOLOv8 model reproduces ground truth GSDs with exceptional fidelity, with statistical tests indicating <5% probability that the distributions are distinct. Based on ground truth data, calibration curves pertaining to this shift can be constructed for a given model. This issue is not present in the U-Net model’s results, indicating that for quantitative measurements where the true size of objects is of interest, special procedures must be implemented for YOLO-based models.
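The post-inference correction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the rigid shift is computed or which statistical tests were used, so the mean-matching offset and the two-sample Kolmogorov-Smirnov statistic below are assumptions, and the function names `equivalent_diameters`, `rigid_shift`, and `ks_statistic` are hypothetical.

```python
import numpy as np

def equivalent_diameters(areas_px, nm_per_px):
    """Equivalent-circle diameter (nm) for each segmented grain area (pixels^2)."""
    areas = np.asarray(areas_px, dtype=float)
    return 2.0 * np.sqrt(areas / np.pi) * nm_per_px

def rigid_shift(pred_diam, true_diam):
    """Shift every predicted diameter by one constant offset.

    Assumption: the offset is chosen so that the predicted mean matches
    the ground-truth mean; the source only says "a rigid shift".
    """
    pred = np.asarray(pred_diam, dtype=float)
    offset = float(np.mean(true_diam) - pred.mean())
    return pred + offset, offset

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (largest gap between empirical CDFs)."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Synthetic illustration only: a hand-traced distribution and a prediction
# that overestimates every diameter by 5 nm.
rng = np.random.default_rng(0)
true_d = rng.normal(50.0, 10.0, size=500)
pred_d = true_d + 5.0
shifted_d, offset = rigid_shift(pred_d, true_d)
ks_before = ks_statistic(pred_d, true_d)
ks_after = ks_statistic(shifted_d, true_d)
```

Repeating the offset calculation for models evaluated at several magnifications would give one point per magnification on a calibration curve of the kind the abstract mentions.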
- Award ID(s):
- 2118206
- PAR ID:
- 10618104
- Publisher / Repository:
- American Institute of Physics
- Date Published:
- Journal Name:
- APL Machine Learning
- Volume:
- 3
- Issue:
- 3
- ISSN:
- 2770-9019
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Ultrasound computed tomography (USCT) is one of the advanced imaging techniques used in structural health monitoring (SHM) and medical imaging due to its relatively low-cost, rapid data acquisition process. The time-domain full waveform inversion (TDFWI) method, an iterative inversion approach, has shown great promise in USCT. However, such an iterative process can be very time-consuming and computationally expensive. It can be greatly accelerated by integrating an AI-based approach (e.g., a convolutional neural network (CNN)). Once trained, the CNN model takes low-iteration TDFWI images as input and instantaneously predicts the material property distribution within the scanned region. Nevertheless, the quality of the reconstruction with the current CNN degrades as the complexity of the material distributions increases. Another challenge is the availability of enough experimental data and, in some cases, even synthetic surrogate data. To alleviate these issues, this paper details a systematic study of the enhancement effect of a 2D CNN (U-Net) by improving the quality with limited training data. To achieve this, different augmentation schemes (flipping and mixing existing data) were implemented to increase the amount and complexity of the training datasets without generating a substantial number of samples. The objective was to evaluate the enhancement effect of these augmentation techniques on the performance of the U-Net model at different FWI iterations. A thousand numerically built samples with acoustic material properties are used to construct multiple datasets from different FWI iterations. A parallelized, high-performance computing (HPC) based framework has been created to rapidly generate the training data. The prediction results were compared against the ground truth images using standard metrics, such as the structural similarity index measure (SSIM) and average mean square error (MSE). The results show that the increased number of samples from augmentations improves shape imaging of the complex regions even with low-iteration FWI training data.
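As a rough illustration of the flipping augmentation and the MSE metric named in this abstract, the following sketch applies the same flip to an input image and its ground-truth map and scores a prediction. The function names `flip_pair` and `mse` are hypothetical, and the mixing scheme is left out because the abstract does not state how existing samples were combined.

```python
import numpy as np

def flip_pair(image, target, axis):
    """Flip an input image and its ground-truth map along the same axis,
    so the pair stays spatially aligned after augmentation."""
    return np.flip(image, axis=axis), np.flip(target, axis=axis)

def mse(pred, truth):
    """Mean square error between a predicted map and the ground truth."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return float(np.mean((pred - truth) ** 2))

# Toy 2x3 arrays standing in for a low-iteration FWI image and its target map.
image = np.arange(6, dtype=float).reshape(2, 3)
target = 2.0 * image
flipped_image, flipped_target = flip_pair(image, target, axis=1)
```

Flipping along each axis and along both axes would yield up to three additional samples per original pair without running any further inversions.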
-
Arai, Igor (Ed.) This research explores practical applications of Transfer Learning and Spatial Attention mechanisms using pre-trained models from an open-source simulator, CARLA (Car Learning to Act). The study focuses on vehicle tracking using aerial images, utilizing transformers and graph algorithms for keypoint detection. The proposed detector training process optimizes model parameters without heavy reliance on manually set hyperparameters. The loss function considers both class distribution and position localization of ground truth data. The study utilizes a three-stage methodology: pre-trained model selection, fine-tuning with a custom synthetic dataset, and evaluation using real-world aerial datasets. The results demonstrate the effectiveness of our synthetic transformer-based transfer learning technique in enhancing object detection accuracy and localization. When tested with real-world images, our approach achieved an 88% detection, compared to only 30% when using YOLOv8. The findings underscore the advantages of incorporating graph-based loss functions in transfer learning and position-encoding techniques, demonstrating their effectiveness in realistic machine learning applications with unbalanced classes.
-
Recent advancements in two-photon calcium imaging have enabled scientists to record the activity of thousands of neurons with cellular resolution. This scope of data collection is crucial to understanding the next generation of neuroscience questions, but analyzing these large recordings requires automated methods for neuron segmentation. Supervised methods for neuron segmentation achieve state-of-the-art accuracy and speed but currently require large amounts of manually generated ground truth training labels. We reduced the required number of training labels by designing a semi-supervised pipeline. Our pipeline used neural network ensembling to generate pseudolabels to train a single shallow U-Net. We tested our method on three publicly available datasets and compared our performance to three widely used segmentation methods. Our method outperformed other methods when trained on a small number of ground truth labels and could achieve state-of-the-art accuracy after training on approximately a quarter of the number of ground truth labels as supervised methods. When trained on many ground truth labels, our pipeline attained higher accuracy than that of state-of-the-art methods. Overall, our work will help researchers accurately process large neural recordings while minimizing the time and effort needed to generate manual labels.
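The pseudolabel step in this abstract can be pictured with a small sketch. The abstract does not describe how the ensemble outputs are combined, so averaging per-pixel probabilities and thresholding at 0.5 are assumptions, and `ensemble_pseudolabels` is a hypothetical name.

```python
import numpy as np

def ensemble_pseudolabels(prob_maps, threshold=0.5):
    """Combine per-pixel neuron probabilities from several networks.

    Assumption: the ensemble is merged by a plain average followed by a
    fixed threshold; the source does not specify the combination rule.
    Returns a binary mask (uint8) usable as a training label.
    """
    stacked = np.stack([np.asarray(p, dtype=float) for p in prob_maps], axis=0)
    mean_prob = stacked.mean(axis=0)
    return (mean_prob >= threshold).astype(np.uint8)

# Three toy 1x2 probability maps standing in for three ensemble members.
maps = [
    np.array([[0.9, 0.2]]),
    np.array([[0.7, 0.4]]),
    np.array([[0.8, 0.3]]),
]
pseudolabels = ensemble_pseudolabels(maps)
```

The resulting masks would then take the place of manual labels when training the single shallow U-Net.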
