Title: Momentum-Net: Fast and convergent iterative neural network for inverse problems
Iterative neural networks (INNs) are rapidly gaining attention for solving inverse problems in imaging, image processing, and computer vision. INNs combine regression NNs and an iterative model-based image reconstruction (MBIR) algorithm, often leading to both good generalization capability and reconstruction quality that outperforms existing MBIR optimization models. This paper proposes the first fast and convergent INN architecture, Momentum-Net, by generalizing a block-wise MBIR algorithm that uses momentum and majorizers with regression NNs. Each iteration of Momentum-Net consists of three core modules: image refining, extrapolation, and MBIR. For fast MBIR, Momentum-Net uses momentum terms in the extrapolation modules and, by using majorizers, noniterative MBIR modules at each iteration. Momentum-Net guarantees convergence to a fixed point for general differentiable (non)convex MBIR functions (or data-fit terms) and convex feasible sets, under two asymptotic conditions. To account for data-fit variations across training and testing samples, we also propose a regularization parameter selection scheme based on the “spectral spread” of majorization matrices. Numerical experiments for light-field photography using a focal stack and for sparse-view computed tomography demonstrate that, given identical regression NN architectures, Momentum-Net significantly improves MBIR speed and accuracy over several existing INNs; it also significantly improves reconstruction quality compared to a state-of-the-art MBIR method in each application.
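The three-module iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a quadratic data-fit term f(x; y) = 0.5‖Ax − y‖², a nonnegativity constraint as the convex feasible set, and a fixed momentum coefficient. The function `refine` is a hypothetical stand-in for the trained regression NN, and the names `rho`, `gamma`, and `delta` are illustrative parameters introduced here.

```python
import numpy as np


def refine(x):
    """Hypothetical stand-in for the trained refining NN: a 3-tap moving average."""
    return np.convolve(x, np.ones(3) / 3.0, mode="same")


def momentum_net_sketch(A, y, n_iter=200, rho=0.5, gamma=0.1, delta=0.5):
    """Illustrative iteration with refining, extrapolation, and MBIR modules.

    Assumes f(x; y) = 0.5 * ||A x - y||^2 and the feasible set {x : x >= 0}.
    """
    n = A.shape[1]
    # Diagonal majorizer: diag(|A|^T |A| 1) + gamma * I dominates the
    # Hessian A^T A + gamma * I, so the MBIR step needs no inner iterations.
    abs_A = np.abs(A)
    m_diag = abs_A.T @ (abs_A @ np.ones(n)) + gamma

    x_prev = np.zeros(n)
    x = np.zeros(n)
    for _ in range(n_iter):
        # 1) Image refining: blend the current image with the refined output.
        z = (1.0 - rho) * x + rho * refine(x)
        # 2) Extrapolation: add a momentum term built from the last two iterates.
        x_extra = x + delta * (x - x_prev)
        # 3) MBIR: one majorized gradient step on
        #    0.5 * ||A x - y||^2 + 0.5 * gamma * ||x - z||^2,
        #    followed by projection onto the nonnegative orthant.
        grad = A.T @ (A @ x_extra - y) + gamma * (x_extra - z)
        x_new = np.maximum(x_extra - grad / m_diag, 0.0)
        x_prev, x = x, x_new
    return x


# Toy usage on a small synthetic problem (all values are made up).
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 20))
x_true = np.linspace(0.0, 1.0, 20)
y = A @ x_true
x_hat = momentum_net_sketch(A, y)
```

Because the majorizer is diagonal, the MBIR module reduces to an element-wise division and a clip, which is the sense in which that step is noniterative in this sketch.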
Award ID(s):
1838179
NSF-PAR ID:
10309167
Journal Name:
IEEE Transactions on Pattern Analysis and Machine Intelligence
ISSN:
0162-8828
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The goal of this study is to develop a new computed tomography (CT) image reconstruction method that improves on the reconstructed image quality of existing methods while reducing computational costs. Existing CT reconstruction is modeled by pixel-based piecewise constant approximations of the integral equation that describes the CT projection data acquisition process. Using these approximations imposes a bottleneck model error and results in a large discrete system. We propose a content-adaptive unstructured grid (CAUG) based regularized CT reconstruction method to address these issues. Specifically, we design a CAUG of the image domain to sparsely represent the underlying image, and introduce a CAUG-based piecewise linear approximation of the integral equation by employing a collocation method. We further apply a regularization defined on the CAUG to the resulting ill-posed linear system, which may lead to a sparse linear representation of the underlying solution. The regularized CT reconstruction is formulated as a convex optimization problem whose objective function consists of a fidelity term based on a weighted least-squares norm, a regularization term, and a constraint term. Here, the corresponding weight matrix is derived from the simultaneous algebraic reconstruction technique (SART). We then develop a SART-type preconditioned fixed-point proximity algorithm to solve the optimization problem. Convergence analysis is provided for the resulting iterative algorithm. Numerical experiments demonstrate the superiority of the proposed method over several existing methods in terms of both suppressing noise and reducing computational costs. These methods include SART without regularization and with quadratic regularization, the traditional total variation (TV) regularized reconstruction method, and the TV superiorized conjugate gradient method on the pixel grid.
  2. This paper presents a policy-driven sequential image augmentation approach for image-related tasks. Our approach applies a sequence of image transformations (e.g., translation, rotation) over a training image, one transformation at a time, with the augmented image from the previous time step treated as the input for the next transformation. This sequential data augmentation substantially improves sample diversity, leading to improved test performance, especially for data-hungry models (e.g., deep neural networks). However, the search for the optimal transformation of each image at each time step of the sequence has high complexity due to its combinatorial nature. To address this challenge, we formulate the search task as a sequential decision process and introduce a deep policy network that learns to produce transformations based on image content. We also develop an iterative algorithm to jointly train a classifier and the policy network in the reinforcement learning setting. The immediate reward of a potential transformation is defined to encourage transformations producing hard samples for the current classifier. At each iteration, we employ the policy network to augment the training dataset, train a classifier with the augmented data, and train the policy net with the aid of the classifier. We apply the above approach to both public image classification benchmarks and a newly collected image dataset for material recognition. Comparisons to alternative augmentation approaches show that our policy-driven approach achieves comparable or improved classification performance while using significantly fewer augmented images. The code is available at https://github.com/Paul-LiPu/rl_autoaug.
  3. In this paper, we present a framework to learn illumination patterns to improve the quality of signal recovery for coded diffraction imaging. We use an alternating minimization-based phase retrieval method with a fixed number of iterations as the iterative method. We represent the iterative phase retrieval method as an unrolled network with a fixed number of layers where each layer of the network corresponds to a single step of iteration, and we minimize the recovery error by optimizing over the illumination patterns. Since the number of iterations/layers is fixed, the recovery has a fixed computational cost. Extensive experimental results on a variety of datasets demonstrate that our proposed method significantly improves the quality of image reconstruction at a fixed computational cost with illumination patterns learned only using a small number of training images. 
  4. Lensless imaging has emerged as a potential solution towards realizing ultra-miniature cameras by eschewing the bulky lens in a traditional camera. Without a focusing lens, the lensless cameras rely on computational algorithms to recover the scenes from multiplexed measurements. However, the current iterative-optimization-based reconstruction algorithms produce noisier and perceptually poorer images. In this work, we propose a non-iterative deep learning-based reconstruction approach that results in orders of magnitude improvement in image quality for lensless reconstructions. Our approach, called FlatNet, lays down a framework for reconstructing high-quality photorealistic images from mask-based lensless cameras, where the camera's forward model formulation is known. FlatNet consists of two stages: (1) an inversion stage that maps the measurement into a space of intermediate reconstruction by learning parameters within the forward model formulation, and (2) a perceptual enhancement stage that improves the perceptual quality of this intermediate reconstruction. These stages are trained together in an end-to-end manner. We show high-quality reconstructions by performing extensive experiments on real and challenging scenes using two different types of lensless prototypes: one which uses a separable forward model and another, which uses a more general non-separable cropped-convolution model. Our end-to-end approach is fast, produces photorealistic reconstructions, and is easy to adopt for other mask-based lensless cameras.
  5. Purpose

    To develop and evaluate a simultaneous multislice (SMS) reconstruction technique that provides noise reduction and leakage blocking for highly accelerated cardiac MRI.

    Methods

    ReadOut Concatenated k‐space SPIRiT (ROCK‐SPIRiT) uses the concept of readout concatenation in image domain to represent SMS encoding, and performs coil self‐consistency as in SPIRiT‐type reconstruction in an extended k‐space, while allowing regularization for further denoising. The proposed method is implemented with and without regularization, and validated on retrospectively SMS‐accelerated cine imaging with three‐fold SMS and two‐fold in‐plane acceleration. ROCK‐SPIRiT is compared with two leakage‐blocking SMS reconstruction methods: readout‐SENSE‐GRAPPA and split slice–GRAPPA. Further evaluation and comparisons are performed using prospectively SMS‐accelerated cine imaging.

    Results

    Results on retrospectively three‐fold SMS and two‐fold in‐plane accelerated cine imaging show that ROCK‐SPIRiT without regularization significantly improves on existing methods in terms of PSNR (readout‐SENSE‐GRAPPA: 33.5 ± 3.2, split slice–GRAPPA: 34.1 ± 3.8, ROCK‐SPIRiT: 35.0 ± 3.3) and SSIM (readout‐SENSE‐GRAPPA: 84.4 ± 8.9, split slice–GRAPPA: 85.0 ± 8.9, ROCK‐SPIRiT: 88.2 ± 6.6 [in percentage]). Regularized ROCK‐SPIRiT significantly outperforms all methods, as characterized by these quantitative metrics (PSNR: 37.6 ± 3.8, SSIM: 94.2 ± 4.1 [in percentage]). The prospectively five‐fold SMS and two‐fold in‐plane accelerated data show that ROCK‐SPIRiT and regularized ROCK‐SPIRiT have visually improved image quality compared with existing methods.

    Conclusion

    The proposed ROCK‐SPIRiT technique reduces noise and interslice leakage in accelerated SMS cardiac cine MRI, improving on existing methods both quantitatively and qualitatively.

     