skip to main content

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 11:00 PM ET on Thursday, January 16 until 2:00 AM ET on Friday, January 17 due to maintenance. We apologize for the inconvenience.


Title: Thermal Image Processing via Physics-Inspired Deep Networks
We introduce DeepIR, a new thermal image processing framework that combines physically accurate sensor modeling with deep network-based image representation. Our key enabling observations are that the images captured by thermal sensors can be factored into slowly changing, scene-independent sensor non-uniformities (that can be accurately modeled using physics) and a scene-specific radiance flux (that is well-represented using a deep network-based regularizer). DeepIR requires neither training data nor periodic ground-truth calibration with a known black body target--making it well suited for practical computer vision tasks. We demonstrate the power of going DeepIR by developing new denoising and super-resolution algorithms that exploit multiple images of the scene captured with camera jitter. Simulated and real data experiments demonstrate that DeepIR can perform high-quality non-uniformity correction with as few as three images, achieving a 10dB PSNR improvement over competing approaches.  more » « less
Award ID(s):
1911094 1730574 1838177
PAR ID:
10298314
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
2nd ICCV Workshop on Learning for Computational Imaging (LCI)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We introduce DeepIR, a new thermal image processing framework that combines physically accurate sensor modeling with deep network-based image representation. Our key enabling observations are that the images captured by thermal sensors can be factored into slowly changing, scene-independent sensor non-uniformities (that can be accurately modeled using physics) and a scene-specific radiance flux (that is well-represented using a deep network-based regularizer). DeepIR requires neither training data nor periodic ground-truth calibration with a known black body target--making it well suited for practical computer vision tasks. We demonstrate the power of going DeepIR by developing new denoising and super-resolution algorithms that exploit multiple images of the scene captured with camera jitter. Simulated and real data experiments demonstrate that DeepIR can perform high-quality non-uniformity correction with as few as three images, achieving a 10dB PSNR improvement over competing approaches. 
    more » « less
  2. We present a method to separate a single image captured under two illuminants, with different spectra, into the two images corresponding to the appearance of the scene under each individual illuminant. We do this by training a deep neural network to predict the per-pixel reflectance chromaticity of the scene, which we use in conjunction with a previous flash/no-flash image-based separation algorithm to produce the final two output images. We design our reflectance chromaticity network and loss functions by incorporating intuitions from the physics of image formation. We show that this leads to significantly better performance than other single image techniques and even approaches the quality of the two image separation method. 
    more » « less
  3. 3D LiDAR scanners are playing an increasingly important role in autonomous driving as they can generate depth information of the environment. However, creating large 3D LiDAR point cloud datasets with point-level labels requires a significant amount of manual annotation. This jeopardizes the efficient development of supervised deep learning algorithms which are often data-hungry. We present a framework to rapidly create point clouds with accurate pointlevel labels from a computer game. To our best knowledge, this is the first publication on LiDAR point cloud simulation framework for autonomous driving. The framework supports data collection from both auto-driving scenes and user-configured scenes. Point clouds from auto-driving scenes can be used as training data for deep learning algorithms, while point clouds from user-configured scenes can be used to systematically test the vulnerability of a neural network, and use the falsifying examples to make the neural network more robust through retraining. In addition, the scene images can be captured simultaneously in order for sensor fusion tasks, with a method proposed to do automatic registration between the point clouds and captured scene images. We show a significant improvement in accuracy (+9%) in point cloud segmentation by augmenting the training dataset with the generated synthesized data. Our experiments also show by testing and retraining the network using point clouds from user-configured scenes, the weakness/blind spots of the neural network can be fixed. 
    more » « less
  4. Image stitching involves combining multiple images of the same scene captured from different viewpoints into a single image with an expanded field of view. While this technique has various applications in computer vision, traditional methods rely on the successive stitching of image pairs taken from multiple cameras. While this approach is effective for organized camera arrays, it can pose challenges for unstructured ones, especially when handling scene overlaps. This paper presents a deep learning-based approach for stitching images from large unstructured camera sets covering complex scenes. Our method processes images concurrently by using the SandFall algorithm to transform data from multiple cameras into a reduced fixed array, thereby minimizing data loss. A customized convolutional neural network then processes these data to produce the final image. By stitching images simultaneously, our method avoids the potential cascading errors seen in sequential pairwise stitching while offering improved time efficiency. In addition, we detail an unsupervised training method for the network utilizing metrics from Generative Adversarial Networks supplemented with supervised learning. Our testing revealed that the proposed approach operates in roughly ∼1/7th the time of many traditional methods on both CPU and GPU platforms, achieving results consistent with established methods.

     
    more » « less
  5. This paper presents a novel method for pedestrian detection and tracking by fusing camera and LiDAR sensor data. To deal with the challenges associated with the autonomous driving scenarios, an integrated tracking and detection framework is proposed. The detection phase is performed by converting LiDAR streams to computationally tractable depth images, and then, a deep neural network is developed to identify pedestrian candidates both in RGB and depth images. To provide accurate information, the detection phase is further enhanced by fusing multi-modal sensor information using the Kalman filter. The tracking phase is a combination of the Kalman filter prediction and an optical flow algorithm to track multiple pedestrians in a scene. We evaluate our framework on a real public driving dataset. Experimental results demonstrate that the proposed method achieves significant performance improvement over a baseline method that solely uses image-based pedestrian detection. 
    more » « less