Title: Long-Range Thermal Target Detection in Data-Limited Settings Using Restricted Receptive Fields
Long-range target detection in thermal infrared imagery is a challenging research problem due to the low resolution and limited detail captured by thermal sensors. The limited size and variability of thermal image datasets for small target detection also constrain the development of accurate and robust detection algorithms. To address both the sensor and data constraints, we propose a novel convolutional neural network (CNN) feature extraction architecture designed for small object detection in data-limited settings. More specifically, we focus on long-range ground-based thermal vehicle detection, but also show the effectiveness of the proposed algorithm on drone and satellite aerial imagery. The design of the proposed architecture is inspired by an analysis of popular object detectors as well as custom-designed networks. We find that restricted receptive fields (rather than more globalized features, as is the trend), along with less downsampling of feature maps and attenuated processing of fine-grained features, lead to greatly improved detection rates while mitigating the model’s capacity to overfit on small or poorly varied datasets. Our approach achieves state-of-the-art results on the Defense Systems Information Analysis Center (DSIAC) automated target recognition (ATR) and the Tiny Object Detection in Aerial Images (AI-TOD) datasets.
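As a rough illustration of the design principles described in the abstract (a restricted receptive field and limited downsampling of feature maps), the following PyTorch sketch builds a shallow feature extractor from small 3x3 convolutions with a single stride-2 stage. The layer counts and channel widths are illustrative assumptions, not the architecture published in the paper.

import torch
import torch.nn as nn

class RestrictedRFBackbone(nn.Module):
    """Illustrative sketch only: keeps the receptive field narrow and
    downsamples the feature map a single time, so tiny targets still
    occupy several cells in the output grid."""
    def __init__(self, in_channels=1, width=32):
        super().__init__()
        self.features = nn.Sequential(
            # Shallow stack of 3x3 convolutions keeps the receptive field small.
            nn.Conv2d(in_channels, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(width, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # A single stride-2 stage limits downsampling of the feature map.
            nn.Conv2d(width, 2 * width, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(2 * width, 2 * width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.features(x)

if __name__ == "__main__":
    # A 512x512 single-channel thermal frame maps to a 256x256 feature grid.
    feats = RestrictedRFBackbone()(torch.randn(1, 1, 512, 512))
    print(feats.shape)  # torch.Size([1, 64, 256, 256])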
Award ID(s):
1650474
PAR ID:
10496367
Author(s) / Creator(s):
; ;
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Sensors
Volume:
23
Issue:
18
ISSN:
1424-8220
Page Range / eLocation ID:
7806
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. While archaeologists have long understood that thermal and multi-spectral imagery can potentially reveal a wide range of ancient cultural landscape features, only recently have advances in drone and sensor technology enabled us to collect these data at sufficiently high spatial and temporal resolution for archaeological field settings. This paper presents results of a study at the Enfield Shaker Village, New Hampshire (USA), in which we collect a time-series of multi-spectral visible light, near-infrared (NIR), and thermal imagery in order to better understand the optimal contexts and environmental conditions for various sensors. We present new methods to remove noise from imagery and to combine multiple raster datasets in order to improve archaeological feature visibility. Analysis compares results of aerial imaging with ground-penetrating radar and magnetic gradiometry surveys, illustrating the complementary nature of these distinct remote sensing methods. Results demonstrate the value of high-resolution thermal and NIR imagery, as well as of multi-temporal image analysis, for the detection of archaeological features on and below the ground surface, offering an improved set of methods for the integration of these emerging technologies into archaeological field investigations. 
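The entry above mentions combining multiple raster datasets to improve archaeological feature visibility. As a hedged sketch of that general idea (not the specific noise-removal or fusion method developed in the study), the following Python snippet normalizes two co-registered bands, such as thermal and NIR, and blends them with an assumed equal weighting.

import numpy as np

def normalize(band):
    # Rescale a raster band to [0, 1], ignoring NaN no-data cells.
    lo, hi = np.nanmin(band), np.nanmax(band)
    return (band - lo) / (hi - lo + 1e-9)

def combine_bands(thermal, nir, w_thermal=0.5):
    # Weighted blend of two co-registered, same-shape raster bands.
    # The 0.5 weighting is an arbitrary illustrative choice.
    return w_thermal * normalize(thermal) + (1.0 - w_thermal) * normalize(nir)

if __name__ == "__main__":
    thermal = np.random.rand(100, 100).astype(np.float32)
    nir = np.random.rand(100, 100).astype(np.float32)
    fused = combine_bands(thermal, nir)
    print(fused.shape, float(fused.min()), float(fused.max()))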
  2. Detecting objects in aerial images is challenging for at least two reasons: (1) target objects such as pedestrians are very small in pixels, making them hard to distinguish from the surrounding background; and (2) targets are in general sparsely and non-uniformly distributed, making dense detection over the full image inefficient. In this paper, we address both issues, inspired by the observation that these targets are often clustered. In particular, we propose a Clustered Detection (ClusDet) network that unifies object clustering and detection in an end-to-end framework. The key components of ClusDet include a cluster proposal sub-network (CPNet), a scale estimation sub-network (ScaleNet), and a dedicated detection network (DetecNet). Given an input image, CPNet produces object cluster regions and ScaleNet estimates object scales for these regions. Then, each scale-normalized cluster region and its features are fed into DetecNet for object detection. ClusDet has several advantages over previous solutions: (1) it greatly reduces the number of chips needed for final object detection and hence achieves high running-time efficiency; (2) the cluster-based scale estimation is more accurate than the previously used single-object-based estimates and hence effectively improves detection of small objects; and (3) the final DetecNet is dedicated to clustered regions and implicitly models prior context information, boosting detection accuracy. The proposed method is tested on three popular aerial image datasets: VisDrone, UAVDT, and DOTA. In all experiments, ClusDet achieves promising performance in comparison with state-of-the-art detectors.
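The ClusDet pipeline summarized above proposes cluster regions, scale-normalizes them, and runs detection only inside those regions. The sketch below traces that dataflow at inference time; the sub-network interfaces are stand-ins, and ScaleNet's scale estimation is folded into a fixed chip size for brevity, so this is an assumed outline rather than the released implementation.

from typing import List, Tuple
import torch
import torch.nn.functional as F

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels

def _remap(box, cluster, chip_size):
    # Map a box from resized-chip coordinates back to full-image coordinates.
    cx1, cy1, cx2, cy2 = cluster
    sx, sy = (cx2 - cx1) / chip_size, (cy2 - cy1) / chip_size
    x1, y1, x2, y2 = box
    return (cx1 + x1 * sx, cy1 + y1 * sy, cx1 + x2 * sx, cy1 + y2 * sy)

def cluster_based_inference(image, cpnet, detecnet, chip_size=512):
    # Detect only inside proposed cluster regions instead of tiling the image.
    detections: List[Box] = []
    for cluster in cpnet(image):                      # cluster region proposals
        x1, y1, x2, y2 = (int(v) for v in cluster)
        chip = image[..., y1:y2, x1:x2]
        # Scale-normalize the chip so small objects reach a workable size.
        chip = F.interpolate(chip, size=(chip_size, chip_size),
                             mode="bilinear", align_corners=False)
        for box in detecnet(chip):                    # boxes in chip coordinates
            detections.append(_remap(box, cluster, chip_size))
    return detections

if __name__ == "__main__":
    image = torch.randn(1, 3, 2000, 2000)
    # Stand-in callables so the sketch runs end to end.
    cpnet = lambda img: [(100.0, 100.0, 600.0, 600.0)]
    detecnet = lambda chip: [(10.0, 10.0, 50.0, 50.0)]
    print(cluster_based_inference(image, cpnet, detecnet))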
  3. Despite significant strides in achieving vehicle autonomy, robust perception under low-light conditions remains a persistent challenge. In this study, we investigate the potential of multispectral imaging, leveraging deep learning models to enhance object detection performance in the context of nighttime driving. Features encoded from the red, green, and blue (RGB) visual spectrum and from thermal infrared images are combined to implement a multispectral object detection model. This has proven more effective than using the visual channels alone, as thermal images provide complementary information when discriminating objects in low-illumination conditions. Additionally, there is a lack of studies on effectively fusing these two modalities for optimal object detection performance. In this work, we present a framework based on the Faster R-CNN architecture with a feature pyramid network. Moreover, we design various fusion approaches using concatenation and addition operators at varying stages of the network to analyze their impact on object detection performance. Our experimental results on the KAIST and FLIR datasets show that our framework outperforms the unimodal baselines and existing multispectral object detectors.
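The entry above compares fusing RGB and thermal features by concatenation and by addition at different stages of the network. The snippet below is a minimal sketch of those two operators applied to feature maps of matching size; the channel width and the stage at which fusion happens are assumptions for illustration, not the paper's reported configuration.

import torch
import torch.nn as nn

class AdditionFusion(nn.Module):
    # Element-wise addition keeps the channel count unchanged.
    def forward(self, rgb_feat, thermal_feat):
        return rgb_feat + thermal_feat

class ConcatFusion(nn.Module):
    # Channel concatenation followed by a 1x1 conv to restore the width.
    def __init__(self, channels):
        super().__init__()
        self.reduce = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_feat, thermal_feat):
        return self.reduce(torch.cat([rgb_feat, thermal_feat], dim=1))

if __name__ == "__main__":
    rgb_feat = torch.randn(1, 256, 64, 80)      # e.g., one FPN-level feature map
    thermal_feat = torch.randn(1, 256, 64, 80)
    print(AdditionFusion()(rgb_feat, thermal_feat).shape)   # [1, 256, 64, 80]
    print(ConcatFusion(256)(rgb_feat, thermal_feat).shape)  # [1, 256, 64, 80]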
  4. When performing remote sensing image segmentation, practitioners often encounter challenges such as a strong foreground–background imbalance, the presence of tiny objects, high object density, intra-class heterogeneity, and inter-class homogeneity. To overcome these challenges, this paper introduces AerialFormer, a hybrid model that strategically combines the strengths of Transformers and Convolutional Neural Networks (CNNs). AerialFormer features a CNN stem module that preserves low-level, high-resolution features, enhancing the model’s capability to process fine details of aerial imagery. The proposed AerialFormer has a hierarchical structure in which a Transformer encoder generates multi-scale features and a multi-dilated CNN (MDC) decoder aggregates information from the multi-scale inputs. As a result, information is taken into account in both local and global contexts, so that powerful representations and high-resolution segmentation can be achieved. AerialFormer was benchmarked on three datasets: iSAID, LoveDA, and Potsdam. Comprehensive experiments and extensive ablation studies show that the proposed AerialFormer remarkably outperforms state-of-the-art methods.
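AerialFormer's decoder, as described above, aggregates multi-scale information with multi-dilated convolutions. The block below is a hedged sketch of that general idea: parallel 3x3 convolutions with different dilation rates whose outputs are concatenated and fused. The dilation rates and channel width are assumed for illustration and are not taken from the paper.

import torch
import torch.nn as nn

class MultiDilatedBlock(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        # Parallel branches see context at different scales via dilation.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])
        # 1x1 conv fuses the concatenated branch outputs back to `channels`.
        self.fuse = nn.Conv2d(len(dilations) * channels, channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.act(self.fuse(out))

if __name__ == "__main__":
    x = torch.randn(1, 64, 128, 128)       # a decoder-stage feature map
    print(MultiDilatedBlock(64)(x).shape)  # torch.Size([1, 64, 128, 128])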