skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Blind Image Inpainting with Sparse Directional Filter Dictionaries for Lightweight CNNs
Blind inpainting algorithms based on deep learning architectures have shown a remarkable performance in recent years, typically outperforming model-based methods both in terms of image quality and run time. However, neural network strategies typically lack a theoretical explanation, which contrasts with the well-understood theory underlying model-based methods. In this work, we leverage the advantages of both approaches by integrating theoretically founded concepts from transform domain methods and sparse approximations into a CNN-based approach for blind image inpainting. To this end, we present a novel strategy to learn convolutional kernels that applies a specifically designed filter dictionary whose elements are linearly combined with trainable weights. Numerical experiments demonstrate the competitiveness of this approach. Our results show not only an improved inpainting quality compared to conventional CNNs but also significantly faster network convergence within a lightweight network design. Our code is available at https://github.com/cv-stuttgart/SDPF_Blind-Inpainting.  more » « less
Award ID(s):
1720487 1720452
PAR ID:
10383527
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
Journal of Mathematical Imaging and Vision
ISSN:
0924-9907
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Existing image inpainting methods typically fill holes by borrowing information from surrounding pixels. They often produce unsatisfactory results when the holes overlap with or touch foreground objects due to lack of information about the actual extent of foreground and background regions within the holes. These scenarios, however, are very important in practice, especially for applications such as the removal of distracting objects. To address the problem, we propose a foreground-aware image inpainting system that explicitly disentangles structure inference and content completion. Specifically, our model learns to predict the foreground contour first, and then inpaints the missing region using the predicted contour as guidance. We show that by such disentanglement, the contour completion model predicts reasonable contours of objects, and further substantially improves the performance of image inpainting. Experiments show that our method significantly outperforms existing methods and achieves superior inpainting results on challenging cases with complex compositions. 
    more » « less
  2. Abstract Removing photobombing elements from images is a challenging task that requires sophisticated image inpainting techniques. Despite the availability of various methods, their effectiveness depends on the complexity of the image and the nature of the distracting element. To address this issue, we conducted a benchmark study to evaluate 10 state-of-the-art photobombing removal methods on a dataset of over 300 images. Our study focused on identifying the most effective image inpainting techniques for removing unwanted regions from images. We annotated the photobombed regions that require removal and evaluated the performance of each method using peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and Fréchet inception distance (FID). The results show that image inpainting techniques can effectively remove photobombing elements, but more robust and accurate methods are needed to handle various image complexities. Our benchmarking study provides a valuable resource for researchers and practitioners to select the most suitable method for their specific photobombing removal task. 
    more » « less
  3. Image security is becoming an increasingly important issue due to advances in deep learning based image manipulations, such as deep image inpainting and deepfakes. There has been considerable work to date on detecting such image manipulations using improved algorithms, with little attention paid to the possible role that hardware advances may have for improving security. We propose to use a focal stack camera as a novel secure imaging device, to the best of our knowledge, that facilitates localizing modified regions in manipulated images. We show that applying convolutional neural network detection methods to focal stack images achieves significantly better detection accuracy compared to single image based forgery detection. This work demonstrates that focal stack images could be used as a novel secure image file format and opens up a new direction for secure imaging. 
    more » « less
  4. Blindrestoration of low-quality faces in the real world has advanced rapidly in recent years. The rich and diverse priors encapsulated by pre-trained face GAN have demonstrated their effectiveness in reconstructing high-quality faces from low-quality observations in the real world. However, the modeling of degradation in real-world face images remains poorly understood, affecting the property of generalization of existing methods. Inspired by the success of pre-trained models and transformers in recent years, we propose to solve the problem of blind restoration by jointly exploiting their power for degradation and prior learning, respectively. On the one hand, we train a two-generator architecture for degradation learning to transfer the style of low-quality real-world faces to the high-resolution output of pre-trained StyleGAN. On the other hand, we present a hybrid architecture, called Skip-Transformer (ST), which combines transformer encoder modules with a pre-trained StyleGAN-based decoder using skip layers. Such a hybrid design is innovative in that it represents the first attempt to jointly exploit the global attention mechanism of the transformer and pre-trained StyleGAN-based generative facial priors. We have compared our DL-ST model with the latest three benchmarks for blind image restoration (DFDNet, PSFRGAN, and GFP-GAN). Our experimental results have shown that this work outperforms all other competing methods, both subjectively and objectively (as measured by the Fréchet Inception Distance and NIQE metrics). 
    more » « less
  5. Recently, Deep Image Prior (DIP) has emerged as an effective unsupervised one-shot learner, delivering competitive results across various image recovery problems. This method only requires the noisy measurements and a forward operator, relying solely on deep networks initialized with random noise to learn and restore the structure of the data. However, DIP is notorious for its vulnerability to overfitting due to the overparameterization of the network. Building upon insights into the impact of the DIP input and drawing inspiration from the gradual denoising process in cutting-edge diffusion models, we introduce Autoencoding Sequential DIP (aSeqDIP) for image reconstruction. This method progressively denoises and reconstructs the image through a sequential optimization of network weights. This is achieved using an input-adaptive DIP objective, combined with an autoencoding regularization term. Compared to diffusion models, our method does not require training data and outperforms other DIP-based methods in mitigating noise overfitting while maintaining a similar number of parameter updates as Vanilla DIP. Through extensive experiments, we validate the effectiveness of our method in various image reconstruction tasks, such as MRI and CT reconstruction, as well as in image restoration tasks like image denoising, inpainting, and non-linear deblurring. 
    more » « less