Deep neural networks, in particular convolutional neural networks, have become highly effective tools for compressing images and solving inverse problems including denoising, inpainting, and reconstruction from few and noisy measurements. This success can be attributed in part to their ability to represent and generate natural images well. Contrary to classical tools such as wavelets, image-generating deep neural networks have a large number of parameters---typically a multiple of their output dimension---and need to be trained on large datasets. In this paper, we propose an untrained simple image model, called the deep decoder, which is a deep neural network that can generate natural images from very few weight parameters. The deep decoder has a simple architecture with no convolutions and fewer weight parameters than the output dimensionality. This underparameterization enables the deep decoder to compress images into a concise set of network weights, which we show is on par with wavelet-based thresholding. Further, underparameterization provides a barrier to overfitting, allowing the deep decoder to have state-of-the-art performance for denoising. The deep decoder is simple in the sense that each layer has an identical structure that consists of only one upsampling unit, pixel-wise linear combination of channels, ReLU activation, and channelwise normalization. This simplicity makes the network amenable to theoretical analysis, and it sheds light on the aspects of neural networks that enable them to form effective signal representations.
more »
« less
A Bayesian Perspective on the Deep Image Prior
The deep image prior was recently introduced as a prior for natural images. It represents images as the output of a convolutional network with random inputs. For “inference”, gradient descent is performed to adjust network parameters to make the output match observations. This approach yields good performance on a range of image reconstruction tasks. We show that the deep image prior is asymptotically equivalent to a stationary Gaussian process prior in the limit as the number of channels in each layer of the network goes to infinity, and derive the corresponding kernel. This informs a Bayesian approach to inference. We show that by conducting posterior inference using stochastic gradient Langevin dynamics we avoid the need for early stopping, which is a drawback of the current approach, and improve results for denoising and impainting tasks. We illustrate these intuitions on a number of 1D and 2D signal reconstruction tasks.
more »
« less
- Award ID(s):
- 1661259
- PAR ID:
- 10162021
- Date Published:
- Journal Name:
- IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- Page Range / eLocation ID:
- 5438 to 5446
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
3D image reconstruction from a limited number of 2D images has been a long-standing challenge in computer vision and image analysis. While deep learning-based approaches have achieved impressive performance in this area, existing deep networks often fail to effectively utilize the shape structures of objects presented in images. As a result, the topology of reconstructed objects may not be well preserved, leading to the presence of artifacts such as discontinuities, holes, or mismatched connections between different parts. In this paper, we propose a shape-aware network based on diffusion models for 3D image reconstruction, named SADIR, to address these issues. In contrast to previous methods that primarily rely on spatial correlations of image intensities for 3D reconstruction, our model leverages shape priors learned from the training data to guide the reconstruction process. To achieve this, we develop a joint learning network that simultaneously learns a mean shape under deformation models. Each reconstructed image is then considered as a deformed variant of the mean shape. We validate our model, SADIR, on both brain and cardiac magnetic resonance images (MRIs). Experimental results show that our method outperforms the baselines with lower reconstruction error and better preservation of the shape structure of objects within the images.more » « less
-
Recently, Deep Image Prior (DIP) has emerged as an effective unsupervised one-shot learner, delivering competitive results across various image recovery problems. This method only requires the noisy measurements and a forward operator, relying solely on deep networks initialized with random noise to learn and restore the structure of the data. However, DIP is notorious for its vulnerability to overfitting due to the overparameterization of the network. Building upon insights into the impact of the DIP input and drawing inspiration from the gradual denoising process in cutting-edge diffusion models, we introduce Autoencoding Sequential DIP (aSeqDIP) for image reconstruction. This method progressively denoises and reconstructs the image through a sequential optimization of network weights. This is achieved using an input-adaptive DIP objective, combined with an autoencoding regularization term. Compared to diffusion models, our method does not require training data and outperforms other DIP-based methods in mitigating noise overfitting while maintaining a similar number of parameter updates as Vanilla DIP. Through extensive experiments, we validate the effectiveness of our method in various image reconstruction tasks, such as MRI and CT reconstruction, as well as in image restoration tasks like image denoising, inpainting, and non-linear deblurring.more » « less
-
Deep learning based PET image reconstruction methods have achieved promising results recently. However, most of these methods follow a supervised learning paradigm, which rely heavily on the availability of high-quality training labels. In particular, the long scanning time required and high radiation exposure associated with PET scans make obtaining these labels impractical. In this paper, we propose a dual-domain unsupervised PET image reconstruction method based on learned descent algorithm, which reconstructs high-quality PET images from sinograms without the need for image labels. Specifically, we unroll the proximal gradient method with a learnable norm for PET image reconstruction problem. The training is unsupervised, using measurement domain loss based on deep image prior as well as image domain loss based on rotation equivariance property. The experimental results demonstrate the superior performance of proposed method compared with maximum-likelihood expectation-maximization (MLEM), total-variation regularized EM (EM-TV) and deep image prior based method (DIP).more » « less
-
Adversarial images are a class of images that have been slightly altered by very specific noise to change the way a deep learning neural network classifies the image. In many cases, this particular noise is imperceptible to the human vision system and thus presents a vulnerability of significant concern to the machine learning and artificial intelligence community. Research towards mitigating this type of attack has taken many forms, one of which is to filter or post process the image before classifying the image with a deep neural network. Techniques such as smoothing, filtering, and compression have been used with varying levels of success. In our work, we explored the use of a neuromorphic software and hardware approach as a protection against adversarial image attack. The algorithm governing our neuromorphic approach is based upon sparse coding. Our sparse coding approach is solved using a dynamic system of equations that models biological low level vision. Our quantitative and qualitative results show that a sparse coding reconstruction is remarkably invariant to changes in sparsity and reconstruction error with respect to classification accuracy. Furthermore, our approach is able to maintain low reconstruction errors without sacrificing classification performance.more » « less
An official website of the United States government

