As solar photovoltaic (PV) has emerged as a dominant player in the energy market, there has been an exponential surge in solar deployment and investment within this sector. With the rapid growth of solar energy adoption, accurate and efficient detection of PV panels has become crucial for effective solar energy mapping and planning. This paper presents the application of the Mask2Former model for segmenting PV panels from a diverse, multi-resolution dataset of satellite and aerial imagery. Our primary objective is to harness Mask2Former’s deep learning capabilities to achieve precise segmentation of PV panels in real-world scenarios. We fine-tune the pre-existing Mask2Former model on a carefully curated multi-resolution dataset and a crowdsourced dataset of satellite and aerial images, showcasing its superiority over other deep learning models like U-Net and DeepLabv3+. Most notably, Mask2Former establishes a new state-of-the-art in semantic segmentation by achieving IoU scores above 95%. Our research contributes significantly to the advancement of solar energy mapping and sets a benchmark for future studies in this field.
Automated Defect Detection and Localization in Photovoltaic Cells Using Semantic Segmentation of Electroluminescence Images
In this article, we propose a deep learning based semantic segmentation model that identifies and segments defects in electroluminescence (EL) images of silicon photovoltaic (PV) cells. The proposed model can differentiate between cracks, contact interruptions, cell interconnect failures, and contact corrosion for both multicrystalline and monocrystalline silicon cells. Our model utilizes a segmentation Deeplabv3 model with a ResNet-50 backbone. It was trained on 17,064 EL images including 256 physically realistic simulated images of PV cells generated to deal with class imbalance. While performing semantic segmentation for five defect classes, this model achieves a weighted F1-score of 0.95, an unweighted F1-score of 0.69, a pixel-level global accuracy of 95.4%, and a mean intersection over union score of 57.3%. In addition, we introduce the UCF EL Defect dataset, a large-scale dataset consisting of 17,064 EL images, which will be publicly available for use by the PV and computer vision research communities.
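The gap between the weighted F1-score (0.95) and the unweighted F1-score (0.69) reported above is typical of class-imbalanced segmentation. The following is a minimal sketch, not the authors' evaluation code, of how these four metrics can be derived from a pixel-level confusion matrix. The function name `segmentation_metrics` and the three-class matrix `toy_conf` are hypothetical and chosen only for illustration; the numbers are not taken from the UCF EL Defect dataset.

```python
def segmentation_metrics(conf):
    """Compute summary metrics from a square confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and
    whose predicted class is j.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    f1s, ious, supports = [], [], []
    for k in range(n):
        tp = conf[k][k]
        fn = sum(conf[k]) - tp                        # true class k, predicted otherwise
        fp = sum(conf[i][k] for i in range(n)) - tp   # predicted k, true class otherwise
        f1_denom = 2 * tp + fp + fn
        iou_denom = tp + fp + fn
        f1s.append(2 * tp / f1_denom if f1_denom else 0.0)
        ious.append(tp / iou_denom if iou_denom else 0.0)
        supports.append(tp + fn)                      # number of true pixels of class k
    return {
        "global_accuracy": sum(conf[k][k] for k in range(n)) / total,
        # Weighted F1: per-class F1 averaged with weights proportional to support.
        "weighted_f1": sum(f * s for f, s in zip(f1s, supports)) / total,
        # Unweighted (macro) F1: every class counts equally.
        "unweighted_f1": sum(f1s) / n,
        "mean_iou": sum(ious) / n,
    }


# Hypothetical confusion matrix: one dominant class and two rare classes.
toy_conf = [
    [9500, 30, 20],
    [200, 150, 10],
    [60, 10, 20],
]
metrics = segmentation_metrics(toy_conf)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the dominant class is segmented well, so global accuracy and weighted F1 stay high, while the rare classes pull the unweighted F1 and the mean IoU down. Since per-class IoU never exceeds per-class F1, the mean IoU is also the lowest of the averaged scores, which matches the ordering of the figures reported in the abstract.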
- Award ID(s): 2050731
- PAR ID: 10539719
- Publisher / Repository: IEEE
- Date Published:
- Journal Name: IEEE Journal of Photovoltaics
- Volume: 12
- Issue: 1
- ISSN: 2156-3381
- Page Range / eLocation ID: 53 to 61
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The rapidly growing use of imaging infrastructure in the energy materials domain drives significant data accumulation in terms of their amount and complexity. The applications of routine techniques for image processing in materials research are often ad hoc, indiscriminate, and empirical, which renders the crucial task of obtaining reliable metrics for quantifications obscure. Moreover, these techniques are expensive, slow, and often involve several preprocessing steps. This paper presents a novel deep learning-based approach for the high-throughput analysis of the particle size distributions from transmission electron microscopy (TEM) images of carbon-supported catalysts for polymer electrolyte fuel cells. A dataset of 40 high-resolution TEM images at different magnification levels, from 10 to 100 nm scales, was annotated manually. This dataset was used to train the U-Net model, with the StarDist formulation for the loss function, for the nanoparticle segmentation task. StarDist reached a precision of 86%, recall of 85%, and an F1-score of 85% by training on datasets as small as thirty images. The segmentation maps outperform models reported in the literature for a similar problem, and the results on particle size analyses agree well with manual particle size measurements, albeit at a significantly lower cost.
-
Scanning electron microscopy (SEM) techniques have been extensively performed to image and study bacterial cells with high-resolution images. Bacterial image segmentation in SEM images is an essential task to distinguish an object of interest and its specific region. These segmentation results can then be used to retrieve quantitative measures (e.g., cell length, area, cell density) for the accurate decision-making process of obtaining cellular objects. However, the complexity of the bacterial segmentation task is a barrier, as the intensity and texture of foreground and background are similar, and also, most clustered bacterial cells in images are partially overlapping with each other. The traditional approaches for identifying cell regions in microscopy images are labor intensive and heavily dependent on the professional knowledge of researchers. To mitigate the aforementioned challenges, in this study, we tested a U-Net-based semantic segmentation architecture followed by a post-processing step of morphological over-segmentation resolution to achieve accurate cell segmentation of SEM-acquired images of bacterial cells grown in a rotary culture system. The approach showed an 89.52% Dice similarity score on bacterial cell segmentation with lower segmentation error rates, validated over several cell overlapping object segmentation approaches with significant performance improvement.
-
Fine-scale sea ice conditions are key to our efforts to understand and model climate change. We propose the first deep learning pipeline to extract fine-scale sea ice layers from high-resolution satellite imagery (Worldview-3). Extracting sea ice from imagery is often challenging due to the potentially complex texture from older ice floes (i.e., floating chunks of sea ice) and surrounding slush ice, making ice floes less distinctive from the surrounding water. We propose a pipeline using a U-Net variant with a ResNet encoder to retrieve ice floe pixel masks from very-high-resolution multispectral satellite imagery. Even with a modest-sized hand-labeled training set and the most basic hyperparameter choices, our CNN-based approach attains an out-of-sample F1 score of 0.698, a nearly 60% improvement when compared to a watershed segmentation baseline. We then supplement our training set with a much larger sample of images weak-labeled by a watershed segmentation algorithm. To ensure watershed-derived pack-ice masks were a good representation of the underlying images, we created a synthetic version for each weak-labeled image, where areas outside the mask are replaced by open water scenery. Adding our synthetic image dataset, obtained at minimal effort when compared with hand-labeling, further improves the out-of-sample F1 score to 0.734. Finally, we use an ensemble of four test metrics and evaluate after mosaicing outputs for entire scenes to mimic a production setting during model selection, reaching an out-of-sample F1 score of 0.753. Our fully-automated pipeline is capable of detecting, monitoring, and segmenting ice floes at a very fine level of detail, and provides a roadmap for other use-cases where partial results can be obtained with threshold-based methods but a context-robust segmentation pipeline is desired.
-
Datasets of documents in Arabic are urgently needed to promote computer vision and natural language processing research that addresses the specifics of the language. Unfortunately, publicly available Arabic datasets are limited in size and restricted to certain document domains. This paper presents the release of BE-Arabic-9K, a dataset of more than 9000 high-quality scanned images from over 700 Arabic books. Among these, 1500 images have been manually segmented into regions and labeled by their functionality. BE-Arabic-9K includes book pages with a wide variety of complex layouts and page contents, making it suitable for various document layout analysis and text recognition research tasks. The paper also presents a page layout segmentation and text extraction baseline model based on fine-tuned Faster R-CNN structure (FFRA). This baseline model yields cross-validation results with an average accuracy of 99.4% and F1 score of 99.1% for text versus non-text block classification on 1500 annotated images of BE-Arabic-9K. These results are remarkably better than those of the state-of-the-art Arabic book page segmentation system ECDP. FFRA also outperforms three other prior systems when tested on a competition benchmark dataset, making it an outstanding baseline model to challenge.

