

Title: The Spectral Bias of the Deep Image Prior
The "deep image prior" proposed by Ulyanov et al. is an intriguing property of neural nets: a convolutional encoder-decoder network can be used as a prior for natural images. The network architecture implicitly introduces a bias; If we train the model to map white noise to a corrupted image, this bias guides the model to fit the true image before fitting the corrupted regions. This paper explores why the deep image prior helps in denoising natural images. We present a novel method to analyze trajectories generated by the deep image prior optimization and demonstrate: (i) convolution layers of the an encoder-decoder decouple the frequency components of the image, learning each at different rates (ii) the model fits lower frequencies first, making early stopping behave as a low pass filter. The experiments study an extension of Cheng et al which showed that at initialization, the deep image prior is equivalent to a stationary Gaussian process.  more » « less
Award ID(s):
1661259
NSF-PAR ID:
10162414
Author(s) / Creator(s):
Date Published:
Journal Name:
Bayesian Deep Learning Workshop and Advances in Neural Information Processing Systems (NeurIPS)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Image restoration and denoising have been challenging problems in optics and computer vision. There has been active research in the optics and imaging communities to develop robust, data-efficient systems for image restoration tasks. Recently, physics-informed deep learning has received wide interest in scientific problems. In this paper, we introduce a three-dimensional integral-imaging-based, physics-informed, unsupervised CycleGAN (Generative Adversarial Network) algorithm for underwater image descattering and recovery. The system consists of a forward and a backward pass. The base architecture consists of an encoder and a decoder. The encoder takes the clean image along with the depth map and the degradation parameters and produces the degraded image. The decoder takes the degraded image generated by the encoder along with the depth map and produces the clean image along with the degradation parameters. To give the input degradation parameters physical significance with respect to a physical model of the degradation, we also incorporate the physical model into the loss function. The proposed model is assessed on a dataset curated through underwater experiments at various levels of turbidity. In addition to recovering the original image from the degraded image, the proposed algorithm also helps to model the distribution from which the degraded images were sampled. Furthermore, the proposed three-dimensional integral imaging approach is compared with a traditional deep-learning-based approach and a 2D imaging approach under turbid and partially occluded environments. The results suggest the proposed approach is promising, especially under the above experimental conditions.

     
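As a concrete illustration of how a physical degradation model can enter the loss, here is a minimal sketch assuming PyTorch and an exponential attenuation-plus-backscatter image-formation model; the specific model, tensor shapes, and names are illustrative assumptions, not the exact formulation used in the paper above. In a full training loop this term would be added to the usual adversarial and cycle-consistency losses.

import torch

def degrade_physics(clean, depth, beta, backscatter):
    # clean: Bx3xHxW in [0, 1]; depth: Bx1xHxW; beta, backscatter: Bx3x1x1.
    # Assumed model: exponential attenuation with depth plus veiling backscatter.
    transmission = torch.exp(-beta * depth)
    return clean * transmission + backscatter * (1.0 - transmission)

def physics_loss(generated_degraded, clean, depth, beta, backscatter):
    # Penalize the generator when its degraded output deviates from the image
    # predicted by the physical degradation model for the same parameters.
    target = degrade_physics(clean, depth, beta, backscatter)
    return torch.mean(torch.abs(generated_degraded - target))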
  2. Convolutional Neural Networks (CNNs) have emerged as highly successful tools for image generation, recovery, and restoration. A major contributing factor to this success is that convolutional networks impose strong prior assumptions about natural images. A surprising experiment that highlights this architectural bias towards natural images is that one can remove noise and corruptions from a natural image without using any training data, by simply fitting (via gradient descent) a randomly initialized, over-parameterized convolutional generator to the corrupted image. While this over-parameterized network can fit the corrupted image perfectly, surprisingly, after a few iterations of gradient descent it generates an almost uncorrupted image. This intriguing phenomenon enables state-of-the-art CNN-based denoising and regularization of other inverse problems. In this paper, we attribute this effect to a particular architectural choice of convolutional networks, namely convolutions with fixed interpolating filters. We then formally characterize the dynamics of fitting a two-layer convolutional generator to a noisy signal and prove that early-stopped gradient descent denoises/regularizes. Our proof relies on showing that convolutional generators fit the structured part of an image significantly faster than the corrupted portion.
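Here is a minimal 1-D sketch of the construction described above, assuming PyTorch: a two-layer generator whose first operation is a fixed (untrained) interpolating upsampler and whose weights are trained by plain gradient descent on a noisy signal. The signal, sizes, learning rate, and iteration counts are illustrative assumptions rather than the paper's exact analysis setup; the printed error to the clean signal illustrates why stopping early denoises.

import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, k, channels = 256, 32, 128
t = torch.linspace(0, 2 * math.pi, n)
clean = torch.sin(t)                                   # smooth, structured part
noisy = clean + 0.3 * torch.randn(n)                   # corrupted observation

# Trainable coefficients feed a FIXED linear-interpolation upsampler (the
# "fixed interpolating filter"); a second trainable layer mixes the channels.
coeffs = (0.1 * torch.randn(channels, k)).requires_grad_()
mix = (0.1 * torch.randn(channels)).requires_grad_()

def generate():
    up = F.interpolate(coeffs[None], size=n, mode="linear", align_corners=True)[0]
    return torch.relu(up).T @ mix                      # (n,) generated signal

opt = torch.optim.SGD([coeffs, mix], lr=0.1)           # plain gradient descent
for it in range(1, 5001):
    opt.zero_grad()
    loss = ((generate() - noisy) ** 2).mean()
    loss.backward()
    opt.step()
    if it in (200, 1000, 5000):
        err = ((generate().detach() - clean) ** 2).mean()
        print(f"iter {it}: fit to noisy {loss.item():.4f}, error to clean {err.item():.4f}")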
  3. Convolutional Neural Networks (CNNs) have emerged as highly successful tools for image generation, recovery, and restoration. This success is often attributed to large amounts of training data. However, recent experimental findings challenge this view and instead suggest that a major contributing factor to this success is that convolutional networks impose strong prior assumptions about natural images. A surprising experiment that highlights this architectural bias towards natural images is that one can remove noise and corruptions from a natural image without using any training data, by simply fitting (via gradient descent) a randomly initialized, over-parameterized convolutional generator to the single corrupted image. While this over-parameterized network can fit the corrupted image perfectly, surprisingly, after a few iterations of gradient descent one obtains the uncorrupted image. This intriguing phenomenon enables state-of-the-art CNN-based denoising and regularization of linear inverse problems such as compressive sensing. In this paper we take a step towards demystifying this experimental phenomenon by attributing the effect to particular architectural choices of convolutional networks, namely convolutions with fixed interpolating filters. We then formally characterize the dynamics of fitting a two-layer convolutional generator to a noisy signal and prove that early-stopped gradient descent denoises/regularizes. This result relies on showing that convolutional generators fit the structured part of an image significantly faster than the corrupted portion.
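Because the two abstracts above hinge on the claim that the structured (smooth) part of the signal is fitted faster than the noise, here is a minimal linear sketch of that mechanism, assuming NumPy: gradient descent on least squares through a fixed interpolating (upsampling) operator converges fastest along directions with large singular values, which are exactly the smooth, low-frequency ones. All sizes and the step size are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
n, k = 256, 32
t = np.linspace(0, 2 * np.pi, n)
clean = np.sin(t)                                  # smooth, structured part
y = clean + 0.3 * rng.standard_normal(n)           # corrupted observation

# Fixed linear-interpolation upsampling operator U: R^k -> R^n.
U = np.zeros((n, k))
src = np.linspace(0, k - 1, n)
lo = np.floor(src).astype(int)
hi = np.minimum(lo + 1, k - 1)
frac = src - lo
U[np.arange(n), lo] += 1 - frac
U[np.arange(n), hi] += frac

c = np.zeros(k)
lr = 1.0 / np.linalg.norm(U, 2) ** 2               # stable gradient-descent step size
for it in range(1, 3001):
    c -= lr * (U.T @ (U @ c - y))
    if it in (50, 300, 3000):
        fit = U @ c
        print(f"iter {it:5d}: error to clean {np.mean((fit - clean) ** 2):.4f}, "
              f"residual to noisy {np.mean((fit - y) ** 2):.4f}")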
  4. Osteoarthritis of the knee is increasingly prevalent as our population ages, representing a growing financial burden and severely impacting quality of life. The invasiveness of in vivo procedures and the high cost of cadaveric studies have left computational tools uniquely suited to study knee biomechanics. Developments in deep learning have great potential for efficiently generating large-scale datasets to enable researchers to perform population-sized investigations, but the time and effort associated with producing robust hexahedral meshes has been a limiting factor in expanding finite element studies to encompass a population. Here we developed a fully automated pipeline capable of taking magnetic resonance knee images and producing a working finite element simulation. We trained an encoder-decoder convolutional neural network to perform semantic image segmentation on the Imorphics dataset provided through the Osteoarthritis Initiative. The Imorphics dataset contained 176 image sequences with varying levels of cartilage degradation. Starting from an open-source swept-extrusion meshing algorithm, we further developed this algorithm until it could produce high-quality meshes for every sequence, and we applied a template-mapping procedure to automatically place soft-tissue attachment points. The meshing algorithm produced simulation-ready meshes for all 176 sequences, regardless of whether provided (manually reconstructed) or predicted (automatically generated) segmentation labels were used. The average time to mesh all bones and cartilage tissues was less than 2 min per knee on an AMD Ryzen 5600X processor, using a parallel pool of three workers for bone meshing, followed by a pool of four workers meshing the four cartilage tissues. Of the 176 sequences with provided segmentation labels, 86% of the resulting meshes completed a simulated flexion-extension activity. We used a reserved testing dataset of 28 sequences unseen during network training to produce simulations derived from predicted labels. We compared tibiofemoral contact mechanics between manual and automated reconstructions for the 24 pairs of successful finite element simulations from this set, resulting in mean root-mean-squared differences under 20% of their respective min-max norms. In combination with further advancements in deep learning, this framework represents a feasible pipeline to produce population-sized finite element studies of the natural knee from subject-specific models.
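As a rough sketch of how such a fully automated pipeline can be orchestrated, the following assumes plain Python with concurrent.futures; every helper function here is a hypothetical stand-in for the components described above (the segmentation network, the swept-extrusion mesher, template-mapped attachments, and the finite element solver), and only the worker counts of three for bones and four for cartilage mirror the text.

from concurrent.futures import ThreadPoolExecutor

BONES = ["femur", "tibia", "patella"]
CARTILAGES = ["femoral", "medial_tibial", "lateral_tibial", "patellar"]

# Hypothetical stand-ins for the real pipeline components.
def segment_knee(mri_sequence): return {"labels": mri_sequence}
def mesh_tissue(name, labels): return f"mesh:{name}"
def place_attachments(meshes): return {"meshes": meshes}
def run_flexion_extension(model): return "simulation results"

def build_knee_model(mri_sequence):
    labels = segment_knee(mri_sequence)                    # encoder-decoder CNN segmentation
    with ThreadPoolExecutor(max_workers=3) as pool:        # bones meshed by a pool of 3 workers
        bone_meshes = list(pool.map(lambda b: mesh_tissue(b, labels), BONES))
    with ThreadPoolExecutor(max_workers=4) as pool:        # then the 4 cartilage tissues by 4 workers
        cart_meshes = list(pool.map(lambda c: mesh_tissue(c, labels), CARTILAGES))
    model = place_attachments(bone_meshes + cart_meshes)   # template-mapped attachment points
    return run_flexion_extension(model)                    # finite element flexion-extension run

print(build_knee_model("example_mri_sequence"))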
  5. In recent times, sequence-to-sequence (seq2seq) models have gained a lot of popularity and provide state-of-the-art performance in a wide variety of tasks, such as machine translation, headline generation, text summarization, speech-to-text conversion, and image caption generation. The underlying framework for all these models is usually a deep neural network comprising an encoder and a decoder. Although simple encoder–decoder models produce competitive results, many researchers have proposed additional improvements over these seq2seq models, e.g., using an attention-based model over the input, pointer-generation models, and self-attention models. However, such seq2seq models suffer from two common problems: 1) exposure bias and 2) inconsistency between the training and test measurements. Recently, a completely novel point of view has emerged in addressing these two problems in seq2seq models, leveraging methods from reinforcement learning (RL). In this survey, we consider seq2seq problems from the RL point of view and provide a formulation that combines the power of RL methods in decision-making with seq2seq models that can retain long-term memories. We present some of the most recent frameworks that combine concepts from RL and deep neural networks. Our work aims to provide insights into some of the problems that inherently arise with current approaches and how we can address them with better RL models. We also provide the source code for implementing most of the RL models discussed in this paper to support the complex task of abstractive text summarization, and we provide some targeted experiments for these RL models, both in terms of performance and training time.
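To make the reinforcement-learning formulation concrete, here is a minimal sketch of the policy-gradient ("self-critical") sequence-level loss that RL-trained seq2seq summarizers commonly use, assuming PyTorch; the reward would typically be a metric such as ROUGE, and all names, shapes, and placeholder values below are illustrative assumptions rather than the survey's reference implementation.

import torch

def self_critical_loss(sample_logprobs, sample_reward, greedy_reward, mask):
    # sample_logprobs: (B, T) log-probabilities of the tokens sampled from the decoder.
    # sample_reward / greedy_reward: (B,) sequence-level rewards (e.g. ROUGE) for the
    # sampled and the greedy-decoded summaries; the greedy score serves as the baseline.
    # mask: (B, T), 1 for real tokens and 0 for padding.
    advantage = (sample_reward - greedy_reward).unsqueeze(1)   # (B, 1)
    per_token = -advantage * sample_logprobs * mask            # REINFORCE with baseline
    return per_token.sum() / mask.sum()

# Usage sketch with placeholder values:
B, T = 2, 5
loss = self_critical_loss(
    sample_logprobs=torch.log(torch.rand(B, T)),   # placeholder token log-probs
    sample_reward=torch.tensor([0.42, 0.31]),      # e.g. ROUGE-L of the sampled summaries
    greedy_reward=torch.tensor([0.40, 0.35]),      # ROUGE-L of the greedy summaries (baseline)
    mask=torch.ones(B, T),
)
print(loss)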