

Title: Algorithmic versatility of SPF-regularization methods
Sparsity promoting functions (SPFs) are commonly used in optimization problems to find solutions that are sparse in some basis. For example, the $\ell_1$-regularized wavelet model and the Rudin–Osher–Fatemi total variation (ROF-TV) model are among the most well-known models for signal and image denoising, respectively. However, recent work demonstrates that convexity is not always desirable in SPFs. In this paper, we replace convex SPFs with their induced nonconvex SPFs and develop algorithms for the resulting model by exploring the intrinsic structures of the nonconvex SPFs. These functions are defined as the difference of the convex SPF and its Moreau envelope. We also present simulations illustrating the performance of a special SPF and the developed algorithms in image denoising.
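A minimal sketch of the construction above, assuming the $\ell_1$ norm as the convex SPF (the function names and the envelope parameter `lam` are illustrative, not the paper's notation):

```python
import numpy as np

def l1(x):
    """Convex SPF: the l1 norm of x."""
    return np.sum(np.abs(x))

def moreau_env_l1(x, lam):
    """Moreau envelope of the l1 norm with parameter lam (elementwise Huber function)."""
    ax = np.abs(x)
    return np.sum(np.where(ax <= lam, ax**2 / (2.0 * lam), ax - lam / 2.0))

def nonconvex_spf(x, lam):
    """Induced nonconvex SPF: the convex SPF minus its Moreau envelope."""
    return l1(x) - moreau_env_l1(x, lam)
```

Elementwise, the difference equals |x| - x^2/(2*lam) for |x| <= lam and the constant lam/2 otherwise, so it grows like the $\ell_1$ norm near zero but saturates for large entries.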
Award ID(s): 1913039
NSF-PAR ID: 10276093
Author(s) / Creator(s): ; ;
Date Published:
Journal Name: Analysis and Applications
Volume: 19
Issue: 01
ISSN: 0219-5305
Page Range / eLocation ID: 43 to 69
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract

    Deep neural networks provide state-of-the-art performance for image denoising, where the goal is to recover a near noise-free image from a noisy observation. The underlying principle is that neural networks trained on large data sets have empirically been shown to generate natural images well from a low-dimensional latent representation of the image. Given such a generator network, a noisy image can be denoised by (i) finding the closest image in the range of the generator or by (ii) passing it through an encoder-generator architecture (known as an autoencoder). However, there is little theory to justify this success, let alone to predict the denoising performance as a function of the network parameters. In this paper, we consider the problem of denoising an image corrupted by additive Gaussian noise using the two generator-based approaches. In both cases, we assume the image is well described by a deep neural network with ReLU activation functions, mapping a $k$-dimensional code to an $n$-dimensional image. In the case of the autoencoder, we show that the feedforward network reduces noise energy by a factor of $O(k/n)$. In the case of optimizing over the range of a generative model, we state and analyze a simple gradient algorithm that minimizes a non-convex loss function and provably reduces noise energy by a factor of $O(k/n)$. We also demonstrate in numerical experiments that this denoising performance is, indeed, achieved by generative priors learned from data.
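    A minimal sketch of approach (i), denoising by gradient descent over the generator's latent code (here `G`, the vector-Jacobian product `grad_G`, the step size, and the iteration count are assumed placeholders, not the paper's algorithm as stated):

```python
import numpy as np

def denoise_in_generator_range(G, grad_G, y, k, steps=200, lr=0.05):
    """Denoise y by gradient descent on z -> 0.5 * ||G(z) - y||^2 over the latent code.

    G      : callable mapping a k-dimensional code to an n-dimensional image
    grad_G : callable (z, r) -> J_G(z)^T r, the vector-Jacobian product of G
    """
    z = np.zeros(k)                  # start from the zero code
    for _ in range(steps):
        r = G(z) - y                 # residual in image space
        z = z - lr * grad_G(z, r)    # chain-rule gradient of the squared loss
    return G(z)                      # the denoised image lies in the generator's range
```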
  2. Abstract

    We present a new feasible proximal gradient method for constrained optimization where both the objective and constraint functions are given by the sum of a smooth, possibly nonconvex function and a convex simple function. The algorithm converts the original problem into a sequence of convex subproblems. Formulating those subproblems requires evaluating at most one gradient value of the original objective and constraint functions. Either exact or approximate subproblem solutions can be computed efficiently in many cases. An important feature of the algorithm is the constraint level parameter. By carefully increasing this level for each subproblem, we provide a simple way to overcome the challenge of bounding the Lagrange multipliers and show that the algorithm follows a strictly feasible solution path until convergence to a stationary point. We develop a simple, proximal-gradient-descent-type analysis showing that the complexity bound of this new algorithm is comparable to that of gradient descent in the unconstrained setting, which is new in the literature. Exploiting this new design and analysis technique, we extend our algorithms to more challenging constrained optimization problems where (1) the objective is a stochastic or finite-sum function, and (2) structured nonsmooth functions replace the smooth components of both the objective and constraint functions. Complexity results for these problems also appear to be new in the literature. Finally, our method can be applied to convex function-constrained problems, where we show complexities similar to those of the proximal gradient method.
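    The generic building block behind such methods is the proximal gradient step sketched below, here with a weighted l1 norm as an example of a "simple" convex term (the function names, step size `eta`, and weight `tau` are illustrative; the paper's feasible, constrained variant adds subproblem and constraint-level machinery on top of this step):

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (an example of a 'simple' convex term)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient_step(x, grad_f, eta, tau):
    """One step x+ = prox_{eta * g}(x - eta * grad_f(x)) with g = tau * ||.||_1."""
    return soft_threshold(x - eta * grad_f(x), eta * tau)
```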

     
  3. Abstract

    We introduce a new class of iterative image reconstruction algorithms for radio interferometry, at the interface of convex optimization and deep learning, inspired by plug-and-play methods. The approach consists of learning a prior image model by training a deep neural network (DNN) as a denoiser, and substituting it for the handcrafted proximal regularization operator of an optimization algorithm. The proposed AIRI ('AI for Regularization in radio-interferometric Imaging') framework, for imaging complex intensity structure with diffuse and faint emission from visibility data, inherits the robustness and interpretability of optimization and the learning power and speed of networks. Our approach relies on three steps. First, we design a low-dynamic-range training database from optical intensity images. Secondly, we train a DNN denoiser at a noise level inferred from the signal-to-noise ratio of the data. We use training losses enhanced with a non-expansiveness term ensuring algorithm convergence, and including on-the-fly database dynamic range enhancement via exponentiation. Thirdly, we plug the learned denoiser into the forward–backward optimization algorithm, resulting in a simple iterative structure alternating a denoising step with a gradient-descent data-fidelity step. We have validated AIRI against CLEAN, optimization algorithms of the SARA family, and a DNN trained to reconstruct the image directly from visibility data. Simulation results show that AIRI is competitive in imaging quality with SARA and its unconstrained forward–backward-based version uSARA, while providing significant acceleration. CLEAN remains faster but offers lower quality. The end-to-end DNN offers further acceleration, but with far lower quality than AIRI.
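    A minimal sketch of the plug-and-play forward-backward iteration described above, for a data-fidelity term 0.5*||Ax - y||^2 (the operator handles `A` and `At`, the `denoiser`, the step size `eta`, and the initialization are assumed placeholders, not the AIRI implementation):

```python
import numpy as np

def pnp_forward_backward(A, At, y, denoiser, eta, iters=100):
    """Plug-and-play forward-backward: alternate a gradient step on the
    data-fidelity term 0.5 * ||A x - y||^2 with a learned denoising step.

    A, At    : callables for the measurement operator and its adjoint
    denoiser : learned denoiser standing in for the handcrafted proximal operator
    """
    x = At(y)                        # simple back-projected starting point
    for _ in range(iters):
        grad = At(A(x) - y)          # gradient of the data-fidelity term
        x = denoiser(x - eta * grad) # denoising step replaces the proximal step
    return x
```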

     
  4. In distributed machine learning, where agents collaboratively learn from diverse private data sets, there is a fundamental tension between consensus and optimality. In this paper, we build on recent algorithmic progress in distributed deep learning to explore various consensus-optimality trade-offs over a fixed communication topology. First, we propose the incremental consensus-based distributed stochastic gradient descent (i-CDSGD) algorithm, which involves multiple consensus steps (where each agent communicates information with its neighbors) within each SGD iteration. Second, we propose the generalized consensus-based distributed SGD (g-CDSGD) algorithm that enables us to navigate the full spectrum from complete consensus (all agents agree) to complete disagreement (each agent converges to individual model parameters). We analytically establish convergence of the proposed algorithms for strongly convex and nonconvex objective functions; we also analyze the momentum variants of the algorithms for the strongly convex case. We support our algorithms via numerical experiments and demonstrate significant improvements over existing methods for collaborative deep learning.
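    A minimal sketch of one consensus-plus-gradient round over a fixed topology (the doubly stochastic mixing matrix `W`, the learning rate, and the number of consensus steps are illustrative; this is the generic scheme rather than the exact i-CDSGD or g-CDSGD updates):

```python
import numpy as np

def consensus_sgd_round(params, grads, W, lr, consensus_steps=1):
    """One round of consensus-based distributed SGD over a fixed topology.

    params          : (num_agents, dim) array, one parameter vector per agent
    grads           : (num_agents, dim) local stochastic gradients
    W               : (num_agents, num_agents) doubly stochastic mixing matrix
    consensus_steps : neighbor-averaging steps performed within this SGD iteration
    """
    mixed = params
    for _ in range(consensus_steps):
        mixed = W @ mixed            # each agent averages with its neighbors
    return mixed - lr * grads        # local stochastic gradient step
```

    Setting consensus_steps greater than 1 mirrors the idea of taking multiple neighbor-averaging steps within each SGD iteration.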
  5. This work considers the minimization of a general convex function f(X) over the cone of positive semi-definite matrices whose optimal solution X* is of low rank. Standard first-order convex solvers require performing an eigenvalue decomposition in each iteration, severely limiting their scalability. A natural nonconvex reformulation of the problem factors the variable X into the product of a rectangular matrix with fewer columns and its transpose. For a special class of matrix sensing and completion problems with quadratic objective functions, local search algorithms applied to the factored problem have been shown to be much more efficient and, in spite of being nonconvex, to converge to the global optimum. The purpose of this work is to extend this line of study to general convex objective functions f(X) and investigate the geometry of the resulting factored formulations. Specifically, we prove that when f(X) satisfies the restricted well-conditioned assumption, each critical point of the factored problem either corresponds to the optimal solution X* or is a strict saddle point where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation ensures that many local search algorithms can converge to the global optimum with random initializations.
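    A minimal sketch of gradient descent on the factored problem g(U) = f(U U^T), with U an n-by-r matrix (the callable `grad_f`, the step size, and the initialization scale are assumptions; the update uses the chain rule and the symmetry of the gradient of f):

```python
import numpy as np

def factored_gradient_descent(grad_f, n, r, steps=500, lr=1e-3, seed=0):
    """Gradient descent on the factored problem g(U) = f(U U^T), U of size n x r.

    grad_f : callable returning the (symmetric) gradient of f at X = U U^T.
    The factorization keeps X positive semi-definite without any eigenvalue
    decomposition; only matrix products are needed per iteration.
    """
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n, r)) / np.sqrt(n)   # random initialization
    for _ in range(steps):
        X = U @ U.T
        U = U - lr * 2.0 * (grad_f(X) @ U)         # chain rule: dg/dU = 2 * grad_f(X) @ U
    return U
```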