

Title: Mapping Electric Transmission Line Infrastructure from Aerial Imagery with Deep Learning
Access to electricity positively correlates with many beneficial socioeconomic outcomes in the developing world, including improvements in education, health, and poverty. Efficient planning for electricity access requires information on the location of existing electric transmission and distribution infrastructure; however, data on existing infrastructure are often unavailable or expensive. We propose a deep learning-based method to automatically detect electric transmission infrastructure from aerial imagery and quantify the results with standard object detection performance metrics. In addition, we explore two challenges to applying these techniques at scale: (1) how models trained on particular geographies generalize to other locations and (2) how the spatial resolution of imagery impacts infrastructure detection accuracy. Our approach achieves object detection performance with an F1 score of 0.53 (0.47 precision and 0.60 recall). Using training data that include more diverse geographies improves performance across the four geographies that we examined. Image resolution significantly impacts object detection performance, which decreases precipitously as the resolution decreases.
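The reported F1 score follows directly from the stated precision and recall via the harmonic-mean formula; a minimal check (the function name is ours):

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
print(round(f1_score(0.47, 0.60), 2))  # → 0.53
```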
Award ID(s):
1937137
NSF-PAR ID:
10296726
Journal Name:
IGARSS 2020-2020 IEEE International Geoscience and Remote Sensing Symposium
Page Range / eLocation ID:
2229 to 2232
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Rapid global warming is catalyzing widespread permafrost degradation in the Arctic, leading to destructive land-surface subsidence that destabilizes and deforms the ground. Consequently, human-built infrastructure constructed upon permafrost is currently at major risk of structural failure. Risk assessment frameworks that attempt to study this issue assume that precise information on the location and extent of infrastructure is known. However, complete, high-quality, uniform geospatial datasets of built infrastructure that are readily available for such scientific studies are lacking. While imagery-enabled mapping can fill this knowledge gap, the small size of individual structures and vast geographical extent of the Arctic necessitate large volumes of very high spatial resolution remote sensing imagery. Transforming this ‘big’ imagery data into ‘science-ready’ information demands highly automated image analysis pipelines driven by advanced computer vision algorithms. Despite this, previous fine resolution studies have been limited to manual digitization of features on locally confined scales. Therefore, this exploratory study serves as the first investigation into fully automated analysis of sub-meter spatial resolution satellite imagery for automated detection of Arctic built infrastructure. We tasked the U-Net, a deep learning-based semantic segmentation model, with classifying different infrastructure types (residential, commercial, public, and industrial buildings, as well as roads) from commercial satellite imagery of Utqiagvik and Prudhoe Bay, Alaska. We also conducted a systematic experiment to understand how image augmentation can impact model performance when labeled training data is limited. When optimal augmentation methods were applied, the U-Net achieved an average F1 score of 0.83. 
Overall, our experimental findings show that the U-Net-based workflow is a promising method for automated Arctic built infrastructure detection that, combined with existing optimized workflows, such as MAPLE, could be expanded to map a multitude of infrastructure types spanning the pan-Arctic.
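The geometric image augmentation described above can be sketched as identical transforms applied to an image tile and its label mask; this is an illustrative sketch, not the paper's actual pipeline, and all names are ours:

```python
def rot90(tile):
    # 90-degree counter-clockwise rotation of a 2-D tile (list of rows).
    return [list(row) for row in zip(*tile)][::-1]

def hflip(tile):
    # Horizontal flip: reverse each row.
    return [row[::-1] for row in tile]

def augment(image, mask):
    """Return 8 geometric variants (4 rotations x {original, flipped}),
    applying the same transform to the image and its label mask."""
    pairs = [(image, mask)]
    img, msk = image, mask
    for _ in range(3):
        img, msk = rot90(img), rot90(msk)
        pairs.append((img, msk))
    pairs += [(hflip(i), hflip(m)) for i, m in pairs]
    return pairs
```

In semantic segmentation the mask must be transformed in lockstep with the image, which is why both are passed together.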

  2. 3D object detection is an essential task in autonomous driving. Recent techniques excel with highly accurate detection rates, provided the 3D input data is obtained from precise but expensive LiDAR technology. Approaches based on cheaper monocular or stereo imagery data have, until now, resulted in drastically lower accuracies, a gap that is commonly attributed to poor image-based depth estimation. However, in this paper we argue that it is not the quality of the data but its representation that accounts for the majority of the difference. Taking the inner workings of convolutional neural networks into consideration, we propose to convert image-based depth maps to pseudo-LiDAR representations, essentially mimicking the LiDAR signal. With this representation we can apply different existing LiDAR-based detection algorithms. On the popular KITTI benchmark, our approach achieves impressive improvements over the existing state of the art in image-based performance, raising the detection accuracy of objects within the 30 m range from the previous state of the art of 22% to an unprecedented 74%. At the time of submission our algorithm holds the highest entry on the KITTI 3D object detection leaderboard for stereo-image-based approaches.
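The core conversion behind a pseudo-LiDAR representation is ordinary pinhole-camera back-projection of each depth pixel into a 3-D point. The sketch below is our simplification of that idea and assumes known camera intrinsics (fx, fy, cx, cy); the function name is hypothetical:

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (rows of metric depths, indexed by pixel
    coordinates u, v) into 3-D camera coordinates via the pinhole model:
    x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

The resulting (x, y, z) point cloud can then be fed to detectors originally designed for LiDAR input.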
  3. Traffic intersections are prime locations for deployment of infrastructure sensors and edge computing nodes to realize the vision of a smart city. It is expected that the needs of a smart city, with regard to traffic and pedestrian traffic systems monitored by cameras/video, can be met by using state-of-the-art artificial-intelligence (AI) based object detectors and trackers. A critical component in designing an effective real-time object detection/tracking pipeline is the understanding of how object density, i.e., the number of objects in a scene, and image resolution and frame rate influence the performance metrics. This study explores the accuracy and speed metrics with the goal of supporting pipelines that meet the precision and latency needs of a real-time environment. We examine the impact of varying image resolution, frame rate, and object density on the object detection performance metrics. The experiments on the COSMOS testbed dataset show that varying the frame width from 416 pixels to 832 pixels, and cropping the images to a square resolution, result in an increase in average precision for all object classes. Decreasing the frame rate from 15 fps to 5 fps preserves more than 90% of the highest F1 score achieved for all object classes. The results inform the choice of video preprocessing stages, modifications to established AI-based object detection/tracking methods, and suggest optimal hyper-parameter values. Index Terms: Object Detection, Smart City, Video Resolution, Deep Learning Models.
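The frame-rate reduction studied above (15 fps to 5 fps) amounts to keeping every third frame; a minimal sketch, with names of our own choosing:

```python
def decimate(frames, src_fps=15, dst_fps=5):
    """Reduce the frame rate by dropping frames, e.g. 15 fps -> 5 fps
    keeps every third frame. Only integer-ratio reductions are handled."""
    if src_fps % dst_fps != 0:
        raise ValueError("src_fps must be an integer multiple of dst_fps")
    return frames[:: src_fps // dst_fps]

# One second of 15 fps video decimated to 5 frames.
print(len(decimate(list(range(15)))))  # → 5
```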
  4. Long-range target detection in thermal infrared imagery is a challenging research problem due to the low resolution and limited detail captured by thermal sensors. The limited size and variability of thermal image datasets for small target detection is also a major constraint on the development of accurate and robust detection algorithms. To address both the sensor and data constraints, we propose a novel convolutional neural network (CNN) feature extraction architecture designed for small object detection in data-limited settings. More specifically, we focus on long-range ground-based thermal vehicle detection, but also show the effectiveness of the proposed algorithm on drone and satellite aerial imagery. The design of the proposed architecture is inspired by an analysis of popular object detectors as well as custom-designed networks. We find that restricted receptive fields (rather than more globalized features, as is the trend), along with less downsampling of feature maps and attenuated processing of fine-grained features, lead to greatly improved detection rates while mitigating the model's capacity to overfit on small or poorly varied datasets. Our approach achieves state-of-the-art results on the Defense Systems Information Analysis Center (DSIAC) automated target recognition (ATR) and the Tiny Object Detection in Aerial Images (AI-TOD) datasets.
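The trade-off between restricted and more global receptive fields can be reasoned about with the standard receptive-field recurrence for stacked convolutions; the small sketch below (ours, ignoring dilation and padding) shows how kernel size and stride grow the field:

```python
def receptive_field(layers):
    """Receptive field at the output of stacked conv layers, given as
    (kernel_size, stride) pairs, via r <- r + (k - 1) * j, j <- j * s,
    where j is the cumulative stride ("jump") between output pixels."""
    r, j = 1, 1
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

# Two stride-1 3x3 convolutions see a 5x5 input patch.
print(receptive_field([(3, 1), (3, 1)]))  # → 5
```

Adding stride or depth enlarges the field quickly, which is why architectures aiming at small targets keep it restricted.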
  5. Automatic video analysis tools are an indispensable component in imaging applications. Object detection, the first and most important step in automatic video analysis, is implemented in many embedded cameras. The accuracy of object detection relies on the quality of the images that are processed. This paper proposes a new image quality model for predicting the performance of object detection on embedded cameras. A video dataset is constructed that considers different factors for quality degradation in the imaging process, such as reduced resolution, noise, and blur. The performances of commonly used low-complexity object detection algorithms are obtained for the dataset. A no-reference regression model based on a bagging ensemble of regression trees is built to predict the accuracy of object detection using observable features in an image. Experimental results show that the proposed model provides more accurate predictions of image quality for object detection than commonly known image quality measures.
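A bagging ensemble averages predictions from base models fit on bootstrap resamples of the training set. The toy sketch below substitutes a 1-nearest-neighbour base learner for the paper's regression trees to stay self-contained; every name in it is ours:

```python
import random

def bagged_predict(train, x, n_models=25, seed=0):
    """Bootstrap-aggregated regression: each 'model' is fit on a bootstrap
    resample of train (a list of (feature, target) pairs), predicts via
    1-nearest-neighbour, and the predictions are averaged."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]  # resample with replacement
        nearest = min(sample, key=lambda p: abs(p[0] - x))
        preds.append(nearest[1])
    return sum(preds) / len(preds)
```

Averaging over resamples reduces the variance of the base learner, which is the motivation for bagging unstable models such as regression trees.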