
Title: Detecting Cracks and Spalling Automatically in Extreme Events by End-to-end Deep Learning Frameworks
In this paper, we develop and implement end-to-end deep learning approaches to automatically detect two important types of structural failure, cracks and spalling, in buildings and bridges during extreme events such as major earthquakes. A total of 2,229 images were annotated and used to train and validate three newly developed Mask Regional Convolutional Neural Networks (Mask R-CNNs). In addition, three sets of public images from different disasters were used to test the accuracy of these models. For detecting and marking these two types of structural failure, one of the proposed methods achieves an accuracy of 67.6% and 81.1% on low- and high-resolution images, respectively, collected from field investigations. The results demonstrate that it is feasible to use the proposed end-to-end method to automatically locate and segment damage in 2D images, which can help human experts during disasters.
Journal Name:
ISPRS Annals of Photogrammetry and Remote Sensing Spatial Information Science, XXIV ISPRS Congress, International Society for Photogrammetry and Remote Sensing
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Robust Mask R-CNN (Mask Regional Convolutional Neural Network) methods are proposed and tested for automatic detection of cracks on structures or their components that may be damaged during extreme events, such as earthquakes. We curated a new dataset of 2,021 labeled images for training and validation, with the aim of finding end-to-end deep neural networks for crack detection in the field. With data augmentation and parameter fine-tuning, a Path Aggregation Network (PANet) with spatial attention mechanisms and a High-resolution Network (HRNet) are introduced into Mask R-CNNs. Tests on three public datasets with low- or high-resolution images demonstrate that the proposed methods achieve a substantial improvement over alternative networks, so the proposed methods may be sufficient for crack detection at a variety of scales in real applications.
  2. Pavement surveying and distress mapping are carried out by roadway authorities to quantify topical and structural damage levels for strategic preventative or rehabilitative action. Failure to time the preventative or rehabilitative action and control distress propagation can lead to severe structural and financial loss of the asset, requiring complete reconstruction. Continuous, computer-aided surveying not only can eliminate human error in analyzing, identifying, defining, and mapping pavement surface distresses, but also can provide a database of road damage patterns and their locations. The database can be used for timely road repairs to maximize the durability of the asphalt and minimize the cost of maintenance. This paper introduces an autonomous surveying scheme to collect, analyze, and map image-based distress data in real time. A descriptive approach is considered for identifying cracks from collected images using a convolutional neural network (CNN) that classifies several types of cracks. Typically, CNN-based schemes require relatively large processing power to detect desired objects in images in real time. However, the portability objective of this work requires the use of lightweight processing units. To that end, the CNN training was optimized by the Bayesian optimization algorithm (BOA) to achieve the maximum accuracy and minimum processing time with the fewest neural network layers. First, a database consisting of a diverse population of crack distress types such as longitudinal, transverse, and alligator cracks, photographed at multiple angles, was prepared. Then, the database was used to train a CNN whose hyperparameters were optimized using BOA. Finally, a heuristic algorithm is introduced to process the CNN's output and produce the crack map. The performance of the classifier and mapping algorithm is examined against still images and videos of cracked pavement captured by a drone. In both instances, the proposed CNN was able to classify the cracks with 97% accuracy. The mapping algorithm is able to map a diverse population of surface crack patterns in real time at a speed of 11.1 km per hour.
  3. Flooding is one of the leading natural-disaster threats to human life and property, especially in densely populated urban areas. Rapid and precise extraction of flooded areas is key to supporting emergency-response planning and providing damage assessment in both spatial and temporal measurements. Unmanned Aerial Vehicle (UAV) technology has recently been recognized as an efficient photogrammetry data acquisition platform that can quickly deliver high-resolution imagery because of its cost-effectiveness, ability to fly at lower altitudes, and ability to enter hazardous areas. Different image classification methods, including the Support Vector Machine (SVM), have been used for flood extent mapping. In recent years, there has been a significant improvement in remote sensing image classification using Convolutional Neural Networks (CNNs). CNNs have demonstrated excellent performance on various tasks including image classification, feature extraction, and segmentation. CNNs can learn features automatically from large datasets through the organization of multiple layers of neurons and can implement nonlinear decision functions. This study investigates the potential of CNN approaches to extract flooded areas from UAV imagery. A VGG-based fully convolutional network (FCN-16s) was used in this research. The model was fine-tuned, and k-fold cross-validation was applied to estimate the performance of the model on the new UAV imagery dataset. This approach allowed FCN-16s to be trained on datasets that contained only one hundred training samples and resulted in a highly accurate classification. A confusion matrix was calculated to estimate the accuracy of the proposed method. The image segmentation results obtained from FCN-16s were compared with the results obtained from FCN-8s, FCN-32s, and SVMs. Experimental results showed that the FCNs could extract flooded areas from UAV images more precisely than traditional classifiers such as SVMs. The classification accuracy achieved by FCN-16s, FCN-8s, FCN-32s, and SVM for the water class was 97.52%, 97.8%, 94.20% and 89%, respectively.
  4. Mountain meadows are an essential part of the alpine–subalpine ecosystem; they provide ecosystem services like pollination and are home to diverse plant communities. Changes in climate affect meadow ecology on multiple levels, for example, by altering growing season dynamics. Tracking the effects of climate change on meadow diversity through the impacts on individual species and overall growing season dynamics is critical to conservation efforts. Here, we explore how to combine crowd‐sourced camera images with machine learning to quantify flowering species richness across a range of elevations in alpine meadows located in Mt. Rainier National Park, Washington, USA. We employed three machine‐learning techniques (Mask R‐CNN, RetinaNet and YOLOv5) to detect wildflower species in images taken during two flowering seasons. We demonstrate that deep learning techniques can detect multiple species, providing information on flowering richness in photographed meadows. The results indicate higher richness just above the tree line for most of the species, which is comparable with patterns found in field studies. We found that the two‐stage detector Mask R‐CNN was more accurate than single‐stage detectors like RetinaNet and YOLO, with the Mask R‐CNN network performing best overall with a mean average precision (mAP) of 0.67, followed by RetinaNet (0.5) and YOLO (0.4). We found that, across the methods, using anchor box variations in multiples of 16 led to enhanced accuracy. We also show that detection is possible even when pictures are interspersed with complex backgrounds and are not in focus. We found differential detection rates depending on species abundance, with additional challenges related to similarity in flower characteristics, labeling errors and occlusion issues. Despite these potential biases and limitations in capturing flowering abundance and location‐specific quantification, accuracy was notable considering the complexity of flower types and picture angles in this dataset. We, therefore, expect that this approach can be used to address many ecological questions that benefit from automated flower detection, including studies of flowering phenology and floral resources, and that it can complement a wide range of ecological approaches (e.g., field observations, experiments, community science, etc.). In all, our study suggests that ecological metrics like floral richness can be efficiently monitored by combining machine learning with easily accessible publicly curated datasets (e.g., Flickr, iNaturalist).

  5. The past decade has witnessed the rising dominance of deep learning and artificial intelligence in a wide range of applications. In particular, the ocean of wireless smartphones and IoT devices continues to fuel the tremendous growth of edge/cloud-based machine learning (ML) systems, including image/speech recognition and classification. To overcome the infrastructural barrier of limited network bandwidth in cloud ML, existing solutions have mainly relied on traditional compression codecs such as JPEG that were historically engineered for human end users instead of ML algorithms. Traditional codecs do not necessarily preserve features important to ML algorithms under limited bandwidth, leading to potentially inferior performance. This work investigates application-driven optimization of programmable commercial codec settings for networked learning tasks such as image classification. Based on the foundation of variational autoencoders (VAEs), we develop an end-to-end networked learning framework by jointly optimizing the codec and classifier without reconstructing images for a given data rate (bandwidth). Compared with the standard JPEG codec, the proposed VAE joint compression and classification framework improves classification accuracy by over 10% and 4%, respectively, for the CIFAR-10 and ImageNet-1k data sets at a data rate of 0.8 bpp. Our proposed VAE-based models show 65% to 99% reductions in encoder size, 1.5x to 13.1x improvements in inference speed, and 25% to 99% savings in power compared to baseline models. We further show that a simple decoder can reconstruct images with sufficient quality without compromising classification accuracy.
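Item 3 above reports per-class accuracy derived from a confusion matrix. As a rough illustration only, the sketch below shows one common way such figures can be computed from per-pixel labels. The abstract does not say which definition of class accuracy was used; this sketch assumes the fraction of true pixels of a class that were labeled correctly. The function names and the toy labels are hypothetical and are not taken from any of the listed papers.

```python
from __future__ import annotations

from typing import Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> list[list[int]]:
    """Count label pairs; rows index the true class, columns the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_accuracy(matrix: list[list[int]]) -> list[float]:
    """Assumed definition: correct pixels of a class / all true pixels of that class."""
    result = []
    for i, row in enumerate(matrix):
        total = sum(row)
        result.append(row[i] / total if total else float("nan"))
    return result


def overall_accuracy(matrix: list[list[int]]) -> float:
    """Fraction of all pixels whose predicted label matches the true label."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else float("nan")


# Made-up per-pixel labels for illustration: 0 = non-water, 1 = water.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred, n_classes=2)
class_acc = per_class_accuracy(cm)
total_acc = overall_accuracy(cm)
```

In practice the label sequences would be the flattened ground-truth and predicted segmentation masks of the evaluation images, and the water-class entry of `class_acc` would correspond to the kind of figure quoted in the abstract.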