

Title: On Matching Visible to Passive Infrared Face Images Using Image Synthesis & Denoising
Performing a direct match between images from different spectra (i.e., passive infrared and visible) is challenging because each spectrum contains different information pertaining to the subject’s face. In this work, we investigate the benefits and limitations of using synthesized visible face images from thermal ones, and vice versa, in cross-spectral face recognition systems. For this purpose, we propose utilizing canonical correlation analysis (CCA) and manifold learning via locally linear embedding (LLE) for dimensionality reduction. There are four primary contributions of this work. First, we formulate the cross-spectral heterogeneous face matching problem (visible to passive IR) using an image synthesis framework. Second, a new processed database composed of two datasets consisting of separate controlled frontal face subsets (VIS-MWIR and VIS-LWIR) is generated from the original, raw face datasets collected in three different bands (visible, MWIR, and LWIR). This multi-band database is constructed using three different methods for preprocessing face images before feature extraction methods are applied. These are: (1) face detection, (2) CSU’s geometric normalization, and (3) our recommended geometric normalization method. Third, a post-synthesis image denoising methodology is applied, which helps alleviate different noise patterns present in synthesized images and improves baseline FR accuracy (i.e., accuracy before image synthesis and denoising are applied) in practical heterogeneous FR scenarios. Finally, an extensive experimental study is performed to demonstrate the feasibility and benefits of cross-spectral matching when using our image synthesis and denoising approach. Our results are also compared to a baseline commercial matcher and various academic matchers provided by CSU’s Face Identification Evaluation System.
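The CCA step can be sketched in a few lines of NumPy: learn a pair of projections that map features from the two spectra into a common subspace where matching is meaningful. This is a minimal, self-contained illustration; the feature dimensions, regularization value, and synthetic data below are assumptions for the example, not the paper's actual pipeline.

```python
import numpy as np

def cca(X, Y, k=2, reg=1e-3):
    """Minimal regularized CCA: learn projections A, B mapping the two
    feature spaces (e.g., visible and thermal) into a shared k-dim subspace."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = len(X)
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    def inv_sqrt(C):  # symmetric inverse square root via eigendecomposition
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, _, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    return Wx @ U[:, :k], Wy @ Vt.T[:, :k]

# Synthetic "two-spectra" features driven by shared identity factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
X = latent @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(500, 8))
Y = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))
A, B = cca(X, Y)
u = (X - X.mean(0)) @ A
v = (Y - Y.mean(0)) @ B
r = np.corrcoef(u[:, 0], v[:, 0])[0, 1]  # first canonical correlation
```

After projection, cross-spectral pairs can be compared directly (e.g., by cosine similarity) in the shared subspace.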
Award ID(s):
1650474 1066197
NSF-PAR ID:
10053525
Author(s) / Creator(s):
Date Published:
Journal Name:
12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017)
Page Range / eLocation ID:
904 to 911
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we propose a convolutional neural network (CNN) based, scenario-dependent and sensor (mobile device) adaptable hierarchical classification framework. Our proposed framework is designed to automatically categorize face data captured under various challenging conditions before the FR algorithms (pre-processing, feature extraction and matching) are used. First, a unique multi-sensor database (using Samsung S4 Zoom, Nokia 1020, iPhone 5S and Samsung S5 phones) is collected, containing face images captured indoors, outdoors, with yaw angles from -90° to +90°, and at two different distances, i.e., 1 and 10 meters. To cope with pose variations, face detection and pose estimation algorithms are used to classify the facial images into a frontal or a non-frontal class. Next, our proposed framework performs tri-level hierarchical classification as follows: Level 1, face images are classified based on phone type; Level 2, face images are further classified into indoor and outdoor images; and finally, Level 3, face images are classified into close (1 m) and far, low-quality (10 m) distance categories. Experimental results show that classification accuracy is scenario dependent, ranging from 95% to more than 98% for Level 2 and from 90% to more than 99% for Level 3 classification. A set of experiments indicates that grouping the data before face matching significantly improves the rank-1 identification rate compared to the original (all vs. all) biometric system.
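The tri-level routing described above can be sketched as a simple decision cascade. In the paper each level is a trained CNN classifier operating on the image itself; the sketch below uses hypothetical metadata fields (`phone`, `indoor`, `distance_m`) purely to show the grouping logic that precedes matching.

```python
def route(sample):
    """Assign a face capture to one leaf bucket of the tri-level hierarchy.
    Level 1: phone model; Level 2: indoor/outdoor; Level 3: distance."""
    level1 = sample["phone"]                       # e.g., "iPhone 5S"
    level2 = "indoor" if sample["indoor"] else "outdoor"
    level3 = "close" if sample["distance_m"] <= 1.0 else "far"
    return (level1, level2, level3)

def group_gallery(samples):
    """Group gallery images per bucket so matching runs within one scenario
    instead of all-vs-all."""
    buckets = {}
    for s in samples:
        buckets.setdefault(route(s), []).append(s["id"])
    return buckets

gallery = [
    {"id": "a", "phone": "iPhone 5S",  "indoor": True,  "distance_m": 1.0},
    {"id": "b", "phone": "iPhone 5S",  "indoor": True,  "distance_m": 10.0},
    {"id": "c", "phone": "Nokia 1020", "indoor": False, "distance_m": 1.0},
]
buckets = group_gallery(gallery)
```

Restricting each probe to its own bucket is what yields the rank-1 improvement over the all-vs-all baseline reported above.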
  2. We design and characterize a novel axilens-based diffractive optics platform that flexibly combines efficient point focusing and grating selectivity and is compatible with scalable top-down fabrication based on a four-level phase mask configuration. This is achieved using phase-modulated compact axilens devices that simultaneously focus incident radiation of selected wavelengths at predefined locations with larger focal depths compared with traditional Fresnel lenses. In addition, the proposed devices are polarization-insensitive and maintain a large focusing efficiency over a broad spectral band. Specifically, here we discuss and characterize modulated axilens configurations designed for long-wavelength infrared (LWIR) in the 6 µm–12 µm wavelength range and in the 4 µm–6 µm midwavelength infrared (MWIR) range. These devices are ideally suited for monolithic integration atop the substrate layers of infrared focal plane arrays and for use as compact microspectrometers. We systematically study their focusing efficiency, spectral response, and cross-talk ratio; further, we demonstrate linear control of multiwavelength focusing on a single plane. Our design method leverages Rayleigh–Sommerfeld diffraction theory and is validated numerically using the finite element method. Finally, we demonstrate the application of spatially modulated axilenses to the realization of a compact, single-lens spectrometer. By optimizing our devices, we achieve a minimum distinguishable wavelength interval of Δλ = 240 nm at λc = 8 µm and Δλ = 165 nm at λc = 5 µm. The proposed devices add fundamental spectroscopic capabilities to compact imaging devices for a number of applications ranging from spectral sorting to LWIR and MWIR phase contrast imaging and detection.
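A four-level axilens phase mask can be sketched from the standard axilens profile, in which the focal length varies linearly with radius to extend the focal depth, and the continuous phase is then quantized to four levels. All numerical values below (design wavelength, focal range, aperture radius) are illustrative assumptions, not the parameters of the devices described above.

```python
import numpy as np

wavelength = 8e-6                  # LWIR design wavelength (assumed)
k = 2 * np.pi / wavelength
f0, delta_f = 200e-6, 50e-6        # focal length sweeps f0 .. f0 + delta_f (assumed)
R = 100e-6                         # lens radius (assumed)

r = np.linspace(0.0, R, 512)
f_r = f0 + delta_f * r / R         # radially varying focal length -> long focal depth
phase = (-k * r**2 / (2.0 * f_r)) % (2 * np.pi)         # continuous axilens phase
levels = np.floor(phase / (2 * np.pi) * 4).astype(int)  # four-level quantization
levels = np.clip(levels, 0, 3)     # guard against floating-point edge cases
```

Each entry of `levels` selects one of the four etch depths of the phase mask; a 2-D mask follows by evaluating the same profile on a radial grid.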

     
  3. With benefits of fast query speed and low storage cost, hashing-based image retrieval approaches have garnered considerable attention from the research community. In this paper, we propose a novel Error-Corrected Deep Cross Modal Hashing (CMH-ECC) method which uses a bitmap specifying the presence of certain facial attributes as an input query to retrieve relevant face images from the database. In this architecture, we generate compact hash codes using an end-to-end deep learning module, which effectively captures the inherent relationships between the face and attribute modality. We also integrate our deep learning module with forward error correction codes to further reduce the distance between different modalities of the same subject. Specifically, the properties of deep hashing and forward error correction codes are exploited to design a cross modal hashing framework with high retrieval performance. Experimental results using two standard datasets with facial attribute and image modalities indicate that our CMH-ECC face image retrieval model outperforms most of the current attribute-based face image retrieval approaches.
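The error-correction idea can be illustrated with a toy repetition code over binary hash codes: redundant bits let the decoder absorb bit flips before Hamming-distance retrieval. CMH-ECC pairs learned deep hashes with stronger forward error correction codes, so the scheme below is only a minimal stand-in for the concept.

```python
import numpy as np

def encode(bits, r=3):
    """Repeat each hash bit r times (toy forward error correction)."""
    return np.repeat(bits, r)

def decode(code, r=3):
    """Majority-vote decode, correcting up to (r-1)//2 flips per bit."""
    return (code.reshape(-1, r).sum(axis=1) > r // 2).astype(int)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

# Toy gallery of 4-bit hash codes (hypothetical identities).
gallery = {"face_1": np.array([0, 1, 1, 0]), "face_2": np.array([1, 1, 0, 0])}

query = encode(np.array([0, 1, 1, 0]))   # query hash for the same subject as face_1
query[2] ^= 1                            # simulate one corrupted bit
decoded = decode(query)                  # error corrected by majority vote
best = min(gallery, key=lambda k: hamming(gallery[k], decoded))
```

Without the decode step, the flipped bit would add spurious Hamming distance between the query and its true match; correcting it first is what "reduces the distance between modalities of the same subject."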
  4. Deep neural networks can learn powerful prior probability models for images, as evidenced by the high-quality generations obtained with recent score-based diffusion methods. But the means by which these networks capture complex global statistical structure, apparently without suffering from the curse of dimensionality, remain a mystery. To study this, we incorporate diffusion methods into a multi-scale decomposition, reducing dimensionality by assuming a stationary local Markov model for wavelet coefficients conditioned on coarser-scale coefficients. We instantiate this model using convolutional neural networks (CNNs) with local receptive fields, which enforce both the stationarity and Markov properties. Global structures are captured using a CNN with receptive fields covering the entire (but small) low-pass image. We test this model on a dataset of face images, which are highly non-stationary and contain large-scale geometric structures. Remarkably, denoising, super-resolution, and image synthesis results all demonstrate that these structures can be captured with significantly smaller conditioning neighborhoods than required by a Markov model implemented in the pixel domain. Our results show that score estimation for large complex images can be reduced to low-dimensional Markov conditional models across scales, alleviating the curse of dimensionality. 
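The multi-scale decomposition can be sketched with a single level of the 2-D Haar transform, which splits an image into a low-pass band (the coarser scale on which the local Markov model is conditioned) and three detail bands. This is a generic wavelet illustration, not necessarily the exact filter bank used in the paper.

```python
import numpy as np

def haar2d(x):
    """One level of the 2-D Haar transform: low-pass + 3 detail bands."""
    a = x[0::2, 0::2]; b = x[0::2, 1::2]; c = x[1::2, 0::2]; d = x[1::2, 1::2]
    low  = (a + b + c + d) / 4          # coarse image, 4x fewer pixels
    horz = (a + b - c - d) / 4
    vert = (a - b + c - d) / 4
    diag = (a - b - c + d) / 4
    return low, (horz, vert, diag)

def ihaar2d(low, details):
    """Exact inverse of haar2d."""
    h, v, d = details
    out = np.empty((low.shape[0] * 2, low.shape[1] * 2))
    out[0::2, 0::2] = low + h + v + d
    out[0::2, 1::2] = low + h - v - d
    out[1::2, 0::2] = low - h + v - d
    out[1::2, 1::2] = low - h - v + d
    return out

rng = np.random.default_rng(1)
img = rng.normal(size=(16, 16))
low, det = haar2d(img)
rec = ihaar2d(low, det)
```

Recursing `haar2d` on `low` yields the coarse-to-fine pyramid: at each scale the detail coefficients are modeled conditioned on the coarser band, which is how the score model sidesteps a pixel-domain Markov neighborhood.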
  5. Objective

    This study investigates a locally low-rank (LLR) denoising algorithm applied to source images from a clinical task-based functional MRI (fMRI) exam before post-processing for improving statistical confidence of task-based activation maps.

    Methods

    Task-based motor and language fMRI was obtained in eleven healthy volunteers under an IRB-approved protocol. LLR denoising was then applied to raw complex-valued image data before fMRI processing. Activation maps generated from conventional non-denoised (control) data were compared with maps derived from LLR-denoised image data. Four board-certified neuroradiologists completed consensus assessment of activation maps; region-specific and aggregate motor and language consensus thresholds were then compared with nonparametric statistical tests. Additional evaluation included retrospective truncation of exam data without and with LLR denoising; an ROI-based analysis tracked t-statistics and temporal SNR (tSNR) as scan durations decreased. A test-retest assessment was performed; retest data were matched with initial test data and compared for one subject.

    Results

    fMRI activation maps generated from LLR-denoised data predominantly exhibited statistically significant (p = 4.88×10⁻⁴ to p = 0.042; one p = 0.062) increases in consensus t-statistic thresholds for motor and language activation maps. Following data truncation, LLR data showed task-specific increases in t-statistics and tSNR exceeding 20% and 50%, respectively, compared to control. LLR denoising enabled truncation of exam durations while preserving cluster volumes at fixed thresholds. Test-retest showed variable activation, with LLR data thresholded higher in matching initial test data.

    Conclusion

    LLR denoising affords robust increases in t-statistics on fMRI activation maps compared to routine processing, and offers potential for reduced scan duration while preserving map quality.
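The core of locally low-rank denoising can be sketched as singular-value hard-thresholding on a local Casorati (voxels × timepoints) matrix: fMRI signal in a small patch is approximately low-rank across time, while thermal noise spreads across all singular values. The threshold value, patch size, and synthetic data below are illustrative assumptions, not the study's processing parameters.

```python
import numpy as np

def llr_denoise(casorati, thresh):
    """Suppress noise by zeroing small singular values of a local
    voxels-by-time Casorati matrix (hard thresholding)."""
    U, s, Vt = np.linalg.svd(casorati, full_matrices=False)
    s_thr = np.where(s > thresh, s, 0.0)
    return (U * s_thr) @ Vt          # broadcasting scales each component

# Synthetic patch: rank-2 temporal "signal" plus thermal noise.
rng = np.random.default_rng(2)
clean = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 50))
noisy = clean + 0.1 * rng.normal(size=(200, 50))
den = llr_denoise(noisy, thresh=5.0)

err_noisy = np.linalg.norm(noisy - clean)
err_den = np.linalg.norm(den - clean)
```

In practice the operation is applied patch-by-patch over the complex-valued image series (hence "locally" low-rank), and the recovered series is then passed to the standard fMRI activation-map pipeline.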

     