skip to main content

Title: Momentum-Net: Fast and convergent iterative neural network for inverse problems
Iterative neural networks (INN) are rapidly gaining attention for solving inverse problems in imaging, image processing, and computer vision. INNs combine regression NNs and an iterative model-based image reconstruction (MBIR) algorithm, often leading to both good generalization capability and outperforming reconstruction quality over existing MBIR optimization models. This paper proposes the first fast and convergent INN architecture, Momentum-Net, by generalizing a block-wise MBIR algorithm that uses momentum and majorizers with regression NNs. For fast MBIR, Momentum-Net uses momentum terms in extrapolation modules, and noniterative MBIR modules at each iteration by using majorizers, where each iteration of Momentum-Net consists of three core modules: image refining, extrapolation, and MBIR. Momentum-Net guarantees convergence to a fixed-point for general differentiable (non)convex MBIR functions (or data-fit terms) and convex feasible sets, under two asymptomatic conditions. To consider data-fit variations across training and testing samples, we also propose a regularization parameter selection scheme based on the “spectral spread” of majorization matrices. Numerical experiments for light-field photography using a focal stack and sparse-view computational tomography demonstrate that, given identical regression NN architectures, Momentum-Net significantly improves MBIR speed and accuracy over several existing INNs; it significantly improves reconstruction quality compared to a state-of-the-art MBIR method in each application
; ; ;
Award ID(s):
Publication Date:
Journal Name:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract The goal of this study is to develop a new computed tomography (CT) image reconstruction method, aiming at improving the quality of the reconstructed images of existing methods while reducing computational costs. Existing CT reconstruction is modeled by pixel-based piecewise constant approximations of the integral equation that describes the CT projection data acquisition process. Using these approximations imposes a bottleneck model error and results in a discrete system of a large size. We propose to develop a content-adaptive unstructured grid (CAUG) based regularized CT reconstruction method to address these issues. Specifically, we design a CAUG of the image domain to sparsely represent the underlying image, and introduce a CAUG-based piecewise linear approximation of the integral equation by employing a collocation method. We further apply a regularization defined on the CAUG for the resulting ill-posed linear system, which may lead to a sparse linear representation for the underlying solution. The regularized CT reconstruction is formulated as a convex optimization problem, whose objective function consists of a weighted least square norm based fidelity term, a regularization term and a constraint term. Here, the corresponding weighted matrix is derived from the simultaneous algebraic reconstruction technique (SART). We then develop a SART-type preconditioned fixed-pointmore »proximity algorithm to solve the optimization problem. Convergence analysis is provided for the resulting iterative algorithm. Numerical experiments demonstrate the superiority of the proposed method over several existing methods in terms of both suppressing noise and reducing computational costs. These methods include the SART without regularization and with the quadratic regularization, the traditional total variation (TV) regularized reconstruction method and the TV superiorized conjugate gradient method on the pixel grid.« less
  2. This paper presents a policy-driven sequential image augmentation approach for image-related tasks. Our approach applies a sequence of image transformations (e.g., translation, rotation) over a training image, one transformation at a time, with the augmented image from the previous time step treated as the input for the next transformation. This sequential data augmentation substantially improves sample diversity, leading to improved test performance, especially for data-hungry models (e.g., deep neural networks). However, the search for the optimal transformation of each image at each time step of the sequence has high complexity due to its combination nature. To address this challenge, we formulate the search task as a sequential decision process and introduce a deep policy network that learns to produce transformations based on image content. We also develop an iterative algorithm to jointly train a classifier and the policy network in the reinforcement learning setting. The immediate reward of a potential transformation is defined to encourage transformations producing hard samples for the current classifier. At each iteration, we employ the policy network to augment the training dataset, train a classifier with the augmented data, and train the policy net with the aid of the classifier. We apply the above approach tomore »both public image classification benchmarks and a newly collected image dataset for material recognition. Comparisons to alternative augmentation approaches show that our policy-driven approach achieves comparable or improved classification performance while using significantly fewer augmented images. The code is available at« less
  3. In this paper, we present a framework to learn illumination patterns to improve the quality of signal recovery for coded diffraction imaging. We use an alternating minimization-based phase retrieval method with a fixed number of iterations as the iterative method. We represent the iterative phase retrieval method as an unrolled network with a fixed number of layers where each layer of the network corresponds to a single step of iteration, and we minimize the recovery error by optimizing over the illumination patterns. Since the number of iterations/layers is fixed, the recovery has a fixed computational cost. Extensive experimental results on a variety of datasets demonstrate that our proposed method significantly improves the quality of image reconstruction at a fixed computational cost with illumination patterns learned only using a small number of training images.
  4. Lensless imaging has emerged as a potential solution towards realizing ultra-miniature cameras by eschewing the bulky lens in a traditional camera. Without a focusing lens, the lensless cameras rely on computational algorithms to recover the scenes from multiplexed measurements. However, the current iterative-optimization-based reconstruction algorithms produce noisier and perceptually poorer images. In this work, we propose a non-iterative deep learning-based reconstruction approach that results in orders of magnitude improvement in image quality for lensless reconstructions. Our approach, called FlatNet, lays down a framework for reconstructing high-quality photorealistic images from mask-based lensless cameras, where the camera's forward model formulation is known. FlatNet consists of two stages: (1) an inversion stage that maps the measurement into a space of intermediate reconstruction by learning parameters within the forward model formulation, and (2) a perceptual enhancement stage that improves the perceptual quality of this intermediate reconstruction. These stages are trained together in an end-to-end manner. We show high-quality reconstructions by performing extensive experiments on real and challenging scenes using two different types of lensless prototypes: one which uses a separable forward model and another, which uses a more general non-separable cropped-convolution model. Our end-to-end approach is fast, produces photorealistic reconstructions, and is easy tomore »adopt for other mask-based lensless cameras.« less
  5. Deconvolution can be used to obtain sharp images or volumes from blurry or encoded measurements in imaging systems. Given knowledge of the system’s point spread function (PSF) over the field of view, a reconstruction algorithm can be used to recover a clear image or volume. Most deconvolution algorithms assume shift-invariance; however, in realistic systems, the PSF varies laterally and axially across the field of view due to aberrations or design. Shift-varying models can be used, but are often slow and computationally intensive. In this work, we propose a deep-learning-based approach that leverages knowledge about the system’s spatially varying PSFs for fast 2D and 3D reconstructions. Our approach, termed MultiWienerNet, uses multiple differentiable Wiener filters paired with a convolutional neural network to incorporate spatial variance. Trained using simulated data and tested on experimental data, our approach offers a625−<#comment/>1600×<#comment/>increase in speed compared to iterative methods with a spatially varying model, and outperforms existing deep-learning-based methods that assume shift invariance.