- Award ID(s):
- 1760102
- NSF-PAR ID:
- 10228445
- Date Published:
- Journal Name:
- Mathematical Programming
- ISSN:
- 0025-5610
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Dasgupta, Sanjoy; Mandt, Stephan; Li, Yingzhen (Eds.) We study accelerated optimization methods in the Gaussian phase retrieval problem. In this setting, we prove that gradient methods with Polyak or Nesterov momentum have implicit regularization similar to that of gradient descent. This implicit regularization ensures that the algorithms remain in a benign region where the cost function is strongly convex and smooth despite being nonconvex in general, which in turn allows these accelerated methods to achieve faster convergence rates than gradient descent. Experimental evidence demonstrates that the accelerated methods converge faster than gradient descent in practice.
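A minimal sketch of the kind of method discussed above: heavy-ball (Polyak) momentum applied to the Gaussian phase retrieval loss, with a spectral initialization. The problem sizes, step size, and momentum parameter are illustrative choices, not values taken from the paper.

```python
import numpy as np

# Heavy-ball momentum on the Gaussian phase retrieval loss
#   f(x) = 1/(4m) * sum_i ((a_i^T x)^2 - y_i)^2,  with  y_i = (a_i^T x_star)^2.
rng = np.random.default_rng(0)
n, m = 50, 400                       # signal dimension, number of measurements
x_star = rng.normal(size=n)
x_star /= np.linalg.norm(x_star)     # unit-norm target (illustrative normalization)
A = rng.normal(size=(m, n))          # rows are the Gaussian sensing vectors a_i
y = (A @ x_star) ** 2

def grad(x):
    r = (A @ x) ** 2 - y             # residuals (a_i^T x)^2 - y_i
    return A.T @ (r * (A @ x)) / m

# Spectral initialization: leading eigenvector of (1/m) * sum_i y_i a_i a_i^T.
_, V = np.linalg.eigh(A.T @ (y[:, None] * A) / m)
x = V[:, -1] * np.sqrt(y.mean())     # scale to roughly ||x_star||

eta, beta = 0.05, 0.9                # step size and Polyak momentum (illustrative)
x_prev = x.copy()
for _ in range(500):
    x, x_prev = x - eta * grad(x) + beta * (x - x_prev), x

dist = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star))
print(f"distance to the true signal (up to global sign): {dist:.3e}")
```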
-
We consider an in-network optimal resource allocation problem in which a group of agents interacting over a connected graph want to meet a demand while minimizing their collective cost. The contribution of this paper is to design a distributed continuous-time algorithm for this problem inspired by a recently developed first-order transformed primal-dual method. The solution applies to a cluster-based setting in which each agent may have a set of subagents, and its local cost is the sum of the costs of these subagents. The proposed algorithm guarantees exponential convergence for strongly convex costs and asymptotic convergence for convex costs. Exponential convergence when the local cost functions are strongly convex is achieved even when the local gradients are only locally Lipschitz. For convex local cost functions, our algorithm guarantees asymptotic convergence to a point in the minimizer set. Through numerical examples, we show that our proposed algorithm converges faster than existing distributed resource allocation algorithms.
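The paper's transformed primal-dual algorithm is not reproduced here; as a loose illustration of distributed resource allocation over a graph, the sketch below discretizes the classic center-free Laplacian dynamics, which preserves the total allocation and drives the agents' marginal costs to agreement. The quadratic local costs, ring graph, and step size are illustrative assumptions.

```python
import numpy as np

# Euler discretization of the center-free Laplacian dynamics
#   x_dot_i = -sum_j w_ij (f_i'(x_i) - f_j'(x_j))
# for  min sum_i f_i(x_i)  s.t.  sum_i x_i = d,  with quadratic local costs
#   f_i(x_i) = 0.5 * c_i * (x_i - b_i)^2  (illustrative choice).
rng = np.random.default_rng(1)
n, d = 5, 10.0                                   # number of agents, total demand
c = rng.uniform(1.0, 3.0, size=n)                # local cost curvatures
b = rng.uniform(-1.0, 1.0, size=n)

# Ring-graph Laplacian: each agent exchanges gradients with two neighbors.
L = 2 * np.eye(n)
for i in range(n):
    L[i, (i + 1) % n] = L[i, (i - 1) % n] = -1.0

x = np.full(n, d / n)                            # feasible start: sum(x) = d
step = 0.05
for _ in range(2000):
    grad = c * (x - b)                           # local marginal costs f_i'(x_i)
    x = x - step * (L @ grad)                    # L @ grad sums to zero, so sum(x) stays d

# At optimality all marginal costs agree (KKT condition for this problem).
print("allocation:", np.round(x, 3))
print("sum(x) =", round(x.sum(), 6), " marginal costs:", np.round(c * (x - b), 3))
```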
-
In this paper we consider the problem of minimizing composite objective functions consisting of a convex differentiable loss function plus a non-smooth regularization term, such as the $L_1$ norm or the nuclear norm, under Rényi differential privacy (RDP). To solve the problem, we propose two stochastic alternating direction method of multipliers (ADMM) algorithms: ssADMM, based on gradient perturbation, and mpADMM, based on output perturbation. Both algorithms decompose the original problem into sub-problems that have closed-form solutions. The first algorithm, ssADMM, applies the recent privacy amplification result for RDP to reduce the amount of noise to add. The second algorithm, mpADMM, numerically computes the sensitivity of ADMM variable updates and releases the updated parameter vector at the end of each epoch. We compare the performance of our algorithms with several baseline algorithms on both real and simulated datasets. Experimental results show that, in high privacy regimes (small ε), ssADMM and mpADMM outperform baseline algorithms in terms of classification and feature selection performance, respectively.
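A loose sketch of the gradient-perturbation idea (not the paper's ssADMM, and without a calibrated RDP noise level): an ADMM-style loop for $L_1$-regularized logistic regression in which the primal update takes a noisy stochastic gradient step and the regularizer update is the closed-form soft-thresholding step. All problem sizes and the noise scale `sigma` are placeholders.

```python
import numpy as np

# Gradient-perturbed ADMM-style loop for L1-regularized logistic regression.
# `sigma` is a placeholder noise scale, not a calibrated privacy guarantee.
rng = np.random.default_rng(2)
n_samples, dim = 200, 20
X = rng.normal(size=(n_samples, dim))
w_true = np.zeros(dim); w_true[:3] = [2.0, -1.5, 1.0]
y = (rng.random(n_samples) < 1 / (1 + np.exp(-X @ w_true))).astype(float) * 2 - 1

lam, rho, eta, sigma = 0.05, 1.0, 0.1, 0.01      # regularizer, penalty, step, noise scale

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

w = np.zeros(dim); z = np.zeros(dim); u = np.zeros(dim)
for _ in range(300):
    # w-update: noisy stochastic-gradient step on the augmented Lagrangian.
    batch = rng.choice(n_samples, size=32, replace=False)
    margin = y[batch] * (X[batch] @ w)
    g_loss = -(X[batch] * (y[batch] / (1 + np.exp(margin)))[:, None]).mean(axis=0)
    noise = rng.normal(scale=sigma, size=dim)     # gradient perturbation
    w = w - eta * (g_loss + noise + rho * (w - z + u))
    # z-update: closed-form proximal step for the L1 term (soft-thresholding).
    z = soft_threshold(w + u, lam / rho)
    # dual update.
    u = u + w - z

print("selected features:", np.flatnonzero(np.abs(z) > 1e-3))
```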
-
We develop a projected Nesterov’s proximal-gradient (PNPG) approach for sparse signal reconstruction that combines adaptive step size with Nesterov’s momentum acceleration. The objective function that we wish to minimize is the sum of a convex differentiable data-fidelity (negative log-likelihood (NLL)) term and a convex regularization term. We apply sparse signal regularization where the signal belongs to a closed convex set within the closure of the domain of the NLL; the convex-set constraint facilitates flexible NLL domains and accurate signal recovery. Signal sparsity is imposed using the ℓ₁-norm penalty on the signal’s linear transform coefficients. The PNPG approach employs a projected Nesterov’s acceleration step with restart and a duality-based inner iteration to compute the proximal mapping. We propose an adaptive step-size selection scheme to obtain a good local majorizing function of the NLL and reduce the time spent backtracking. Thanks to step-size adaptation, PNPG converges faster than the methods that do not adjust to the local curvature of the NLL. We present an integrated derivation of the momentum acceleration and proofs of O(k⁻²) objective function convergence rate and convergence of the iterates, which account for adaptive step size, inexactness of the iterative proximal mapping, and the convex-set constraint. The tuning of PNPG is largely application independent. Tomographic and compressed-sensing reconstruction experiments with Poisson generalized linear and Gaussian linear measurement models demonstrate the performance of the proposed approach.
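A generic sketch in the spirit of PNPG rather than the authors' implementation: Nesterov-accelerated proximal gradient with a projection onto a convex set (here the nonnegative orthant), an ℓ₁ prox, and a function-value restart, applied to a Gaussian linear (least-squares) NLL. The paper's adaptive step-size scheme is replaced by a fixed 1/L step, and all data are synthetic.

```python
import numpy as np

# Accelerated proximal gradient with projection and restart for
#   min 0.5*||A x - y||^2 + lam*||x||_1   s.t.  x >= 0.
rng = np.random.default_rng(3)
m, n = 80, 200
A = rng.normal(size=(m, n))
x_true = np.maximum(rng.normal(size=n), 0) * (rng.random(n) < 0.05)
y = A @ x_true + 0.01 * rng.normal(size=m)

lam = 0.1
L = np.linalg.norm(A, 2) ** 2                      # Lipschitz constant of the gradient

def prox_project(v, t):
    # prox of lam*||.||_1 composed with projection onto {x >= 0}
    return np.maximum(v - t * lam, 0.0)

def objective(x):
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

x = np.zeros(n); x_old = x.copy(); t = 1.0
for _ in range(500):
    s = x + ((t - 1.0) / (t + 2.0)) * (x - x_old)  # Nesterov extrapolation
    x_new = prox_project(s - A.T @ (A @ s - y) / L, 1.0 / L)
    if objective(x_new) > objective(x):            # restart heuristic on objective increase
        t, s = 1.0, x
        x_new = prox_project(s - A.T @ (A @ s - y) / L, 1.0 / L)
    x_old, x, t = x, x_new, t + 1.0

print("recovered support size:", np.flatnonzero(x > 1e-3).size,
      "true support size:", np.flatnonzero(x_true).size)
```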
-
Abstract: We study inexact fixed-point proximity algorithms for solving a class of sparse regularization problems involving the $\ell_0$ norm. Specifically, the $\ell_0$ model has an objective function that is the sum of a convex fidelity term and a Moreau envelope of the $\ell_0$ norm regularization term. Such an $\ell_0$ model is non-convex. Existing exact algorithms for solving the problems require the availability of closed-form formulas for the proximity operator of convex functions involved in the objective function. When such formulas are not available, numerical computation of the proximity operator becomes inevitable. This leads to inexact iteration algorithms. We investigate in this paper how the numerical error for every step of the iteration should be controlled to ensure global convergence of the inexact algorithms. We establish a theoretical result that guarantees the sequence generated by the proposed inexact algorithm converges to a local minimizer of the optimization problem. We implement the proposed algorithms for three applications of practical importance in machine learning and image science, which include regression, classification, and image deblurring. The numerical results demonstrate the convergence of the proposed algorithm and confirm that local minimizers of the $\ell_0$ models found by the proposed inexact algorithm outperform global minimizers of the corresponding $\ell_1$ models, in terms of approximation accuracy and sparsity of the solutions.
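The inexact fixed-point proximity scheme itself is not reproduced here; the sketch below only illustrates the $\ell_0$ ingredient it builds on: the proximity operator of $\lambda\|\cdot\|_0$ has a closed form (hard thresholding), and plugging it into a plain proximal-gradient loop gives iterative hard thresholding for a least-squares fidelity term. Problem sizes and the regularization weight are illustrative.

```python
import numpy as np

# Iterative hard thresholding for  min 0.5*||A x - y||^2 + lam*||x||_0,
# using the closed-form proximity operator of the L0 penalty.
rng = np.random.default_rng(4)
m, n = 60, 150
A = rng.normal(size=(m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = 3.0 * rng.normal(size=5)
y = A @ x_true + 0.01 * rng.normal(size=m)

lam = 0.5
L = np.linalg.norm(A, 2) ** 2                     # gradient Lipschitz constant; step 1/L

def prox_l0(v, t):
    # prox_{t*lam*||.||_0}(v): keep entries with v_i^2 > 2*t*lam, zero out the rest
    out = v.copy()
    out[v ** 2 <= 2 * t * lam] = 0.0
    return out

x = np.zeros(n)
for _ in range(300):
    x = prox_l0(x - A.T @ (A @ x - y) / L, 1.0 / L)

print("recovered support:", sorted(np.flatnonzero(x)))
print("true support:     ", sorted(np.flatnonzero(x_true)))
```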