Title: On the Convergence of NEAR-DGD for Nonconvex Optimization with Second Order Guarantees
We consider the setting where the nodes of an undirected, connected network collaborate to solve a shared objective modeled as the sum of smooth functions, where each summand is privately known to a unique node. NEAR-DGD is a distributed first-order method that permits adjusting the amount of communication between nodes relative to the amount of local computation in order to balance convergence accuracy and total application cost. In this work, we generalize the convergence properties of a variant of NEAR-DGD from the strongly convex to the nonconvex case. Under mild assumptions, we show convergence to minimizers of a custom Lyapunov function. Moreover, we demonstrate that the gap between those minimizers and the second-order stationary solutions of the original problem can be made arbitrarily small by an appropriate choice of algorithm parameters. Finally, we accompany our theoretical analysis with a numerical experiment evaluating the empirical performance of NEAR-DGD in the nonconvex setting.
Award ID(s):
2024774
PAR ID:
10353351
Author(s) / Creator(s):
;
Date Published:
Journal Name:
2021 60th IEEE Conference on Decision and Control (CDC)
Page Range / eLocation ID:
259 to 264
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
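
The communication/computation trade-off described in the abstract can be made concrete with a short sketch. Below is a minimal, illustrative NumPy implementation of a NEAR-DGD-style iteration, in which each outer step performs t rounds of neighborhood averaging followed by one local gradient step. The mixing matrix W, all names, and the parameter defaults are assumptions for illustration, not the authors' implementation.

import numpy as np

def near_dgd(x0, grad_f, W, alpha=0.01, t=1, iters=500):
    """Illustrative NEAR-DGD-style iteration (not the authors' code).

    x0     : (n, d) array, one local iterate per node
    grad_f : list of n callables, grad_f[i](y) -> gradient of node i's summand
    W      : (n, n) doubly stochastic mixing matrix respecting the network
    alpha  : step size
    t      : consensus (communication) rounds per local gradient step
    """
    x = x0.copy()
    n = x.shape[0]
    for _ in range(iters):
        y = x
        for _ in range(t):          # communication: t averaging rounds
            y = W @ y
        # computation: one local gradient step at every node
        x = y - alpha * np.stack([grad_f[i](y[i]) for i in range(n)])
    return x

Increasing t tightens consensus at the cost of more communication per iteration, which is precisely the accuracy-versus-cost knob the abstract describes.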
More Like this
  1.
    We propose a new discontinuous Galerkin (dG) method for a geometrically nonlinear Kirchhoff plate model for large isometric bending deformations. The minimization problem is nonconvex due to the isometry constraint. We present a practical discrete gradient flow that decreases the energy and computes discrete minimizers that satisfy a prescribed discrete isometry defect. We prove Γ-convergence of the discrete energies and discrete global minimizers. We document the flexibility and accuracy of the dG method with several numerical experiments.
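
    The energy-decreasing discrete gradient flow used in this item (and in item 2 below) follows a simple pattern: step along the negative gradient and shrink the pseudo-time step until the energy strictly decreases. The toy sketch below illustrates the mechanism with a scalar double-well energy standing in for the plate bending energy; the example energy and all names are illustrative assumptions.

    import numpy as np

    def gradient_flow(E, gradE, u0, tau=1.0, tol=1e-8, max_steps=10000):
        """Toy energy-decreasing gradient flow (illustrative only)."""
        u = u0.copy()
        for _ in range(max_steps):
            g = gradE(u)
            if np.linalg.norm(g) < tol:
                break
            step = tau
            while E(u - step * g) >= E(u):  # backtrack: accepted steps decrease E
                step *= 0.5
            u = u - step * g
        return u

    # Nonconvex double-well energy in place of the bending energy.
    E = lambda u: np.sum((u**2 - 1.0) ** 2)
    gradE = lambda u: 4.0 * u * (u**2 - 1.0)
    u_min = gradient_flow(E, gradE, np.array([0.3, -2.0, 1.7]))

    The constraint handling (isometry or metric defect) is the hard part of the actual methods and is omitted here.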
  2. A local discontinuous Galerkin (LDG) method for approximating large deformations of prestrained plates is introduced and tested on several insightful numerical examples in Bonito et al. (2022, LDG approximation of large deformations of prestrained plates. J. Comput. Phys., 448, 110719). This paper presents a numerical analysis of this LDG method, focusing on the free boundary case. The problem consists of minimizing a fourth-order bending energy subject to a nonlinear and nonconvex metric constraint. The energy is discretized using LDG and a discrete gradient flow is used to compute discrete minimizers. We first show Γ-convergence of the discrete energy to the continuous one. Then we prove that the discrete gradient flow decreases the energy at each step and computes discrete minimizers with control of the metric constraint defect. We also present a numerical scheme for initializing the gradient flow and discuss its conditional stability.
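
    Items 1, 2, and 5 in this list all hinge on Γ-convergence. For reference, the standard definition (generic notation, not quoted from these papers) reads, in LaTeX:

    \[
    F_h \xrightarrow{\ \Gamma\ } F \quad\Longleftrightarrow\quad
    \begin{cases}
    F(x) \le \liminf_{h \to 0} F_h(x_h) & \text{for every sequence } x_h \to x,\\[2pt]
    \limsup_{h \to 0} F_h(x_h) \le F(x) & \text{for some recovery sequence } x_h \to x,
    \end{cases}
    \]

    for every x in the underlying space. Together with equicoercivity, these two conditions imply that accumulation points of discrete minimizers minimize the limit functional, which is why Γ-convergence of discrete energies yields convergence of discrete minimizers.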
  3. Stochastic (sub)gradient methods require step size schedule tuning to perform well in practice. Classical tuning strategies decay the step size polynomially and lead to optimal sublinear rates on (strongly) convex problems. An alternative schedule, popular in nonconvex optimization, is called geometric step decay and proceeds by halving the step size after every few epochs. In recent work, geometric step decay was shown to improve exponentially upon classical sublinear rates for the class of sharp convex functions. In this work, we ask whether geometric step decay similarly improves stochastic algorithms for the class of sharp weakly convex problems. Such losses feature in modern statistical recovery problems and lead to a new challenge not present in the convex setting: the region of convergence is local, so one must bound the probability of escape. Our main result shows that for a large class of stochastic, sharp, nonsmooth, and nonconvex problems a geometric step decay schedule endows well-known algorithms with a local linear (or nearly linear) rate of convergence to global minimizers. This guarantee applies to the stochastic projected subgradient, proximal point, and prox-linear algorithms. As an application of our main result, we analyze two statistical recovery tasks—phase retrieval and blind deconvolution—and match the best known guarantees under Gaussian measurement models and establish new guarantees under heavy-tailed distributions. 
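
    The geometric step decay schedule analyzed in item 3 is one line of code: hold the step size fixed within an epoch and halve it between epochs. The sketch below applies it to a subgradient method on a toy phase-retrieval-style loss; the loss, initialization, and parameter values are illustrative assumptions, and the projection step is omitted since the toy problem is unconstrained. Consistent with the local region of convergence discussed in the abstract, the iterate is initialized near the ground truth.

    import numpy as np

    def subgradient_geometric_decay(x0, subgrad, step0=0.1, decay=0.5,
                                    epoch_len=200, epochs=15):
        """Subgradient method with geometric step decay (illustrative)."""
        x = x0.copy()
        step = step0
        for _ in range(epochs):
            for _ in range(epoch_len):
                x = x - step * subgrad(x)
            step *= decay               # halve the step size between epochs
        return x

    # Toy sharp nonconvex loss: f(x) = mean_i |(a_i . x)^2 - b_i|.
    rng = np.random.default_rng(0)
    m, d = 50, 5
    A = rng.normal(size=(m, d))
    x_true = rng.normal(size=d)
    b = (A @ x_true) ** 2

    def subgrad(x):
        r = (A @ x) ** 2 - b
        return (2.0 / m) * (A.T @ (np.sign(r) * (A @ x)))

    x_hat = subgradient_geometric_decay(x_true + 0.3 * rng.normal(size=d), subgrad)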
  4. Nonconvex and nonsmooth problems have recently attracted considerable attention in machine learning. However, developing efficient methods with performance guarantees for nonconvex and nonsmooth optimization problems remains a challenge. Proximal coordinate descent (PCD) has been widely used for solving optimization problems, but knowledge of PCD methods in the nonconvex setting is very limited. On the other hand, asynchronous proximal coordinate descent (APCD) has recently received much attention as a means of solving large-scale problems. However, accelerated variants of APCD algorithms are rarely studied. In this project, we extend the APCD method to an accelerated algorithm (AAPCD) for nonsmooth and nonconvex problems satisfying the sufficient descent property, by comparing the function values at the proximal update and at a linearly extrapolated point using a delay-aware momentum value. To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extensions of APCD algorithms for general nonconvex and nonsmooth problems, ensuring that every limit point is a critical point for both bounded and unbounded delays. By leveraging the Kurdyka-Łojasiewicz property, we show linear and sublinear convergence rates for the deterministic AAPCD with bounded delays. Numerical results demonstrate the practical speed advantage of our algorithm.
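
    For readers unfamiliar with the proximal update that APCD and AAPCD in item 4 build on, here is a minimal synchronous proximal coordinate descent sketch for an l1-regularized smooth nonconvex loss. The loss and all names are illustrative assumptions, and the sketch deliberately omits the asynchrony, delays, and momentum that constitute the paper's actual contribution.

    import numpy as np

    def soft_threshold(z, t):
        """Proximal map of t*|.|, applied coordinatewise."""
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def prox_coordinate_descent(x0, grad_coord, lam, L, iters=2000, seed=0):
        """Minimal PCD sketch for F(x) = f(x) + lam * ||x||_1.

        grad_coord(x, j) : j-th partial derivative of the smooth part f
        L                : coordinatewise Lipschitz constant of grad f
        """
        rng = np.random.default_rng(seed)
        x = x0.copy()
        for _ in range(iters):
            j = rng.integers(x.size)                      # random coordinate
            g = grad_coord(x, j)
            x[j] = soft_threshold(x[j] - g / L, lam / L)  # prox-gradient update
        return x

    # Smooth nonconvex example: f(x) = sum_i log(1 + (Ax - b)_i^2).
    rng = np.random.default_rng(1)
    A = rng.normal(size=(30, 10))
    b = rng.normal(size=30)

    def grad_coord(x, j):
        r = A @ x - b
        return A[:, j] @ (2.0 * r / (1.0 + r**2))

    L = 2.0 * np.max(np.sum(A**2, axis=0))  # crude coordinatewise bound
    x_hat = prox_coordinate_descent(np.zeros(10), grad_coord, lam=0.1, L=L)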
  5. Demeter, Ciprian (Ed.)
    Given an image u_0, the aim of minimising the Mumford-Shah functional is to find a decomposition of the image domain into sub-domains and a piecewise smooth approximation u of u_0 such that u varies smoothly within each sub-domain. Since the Mumford-Shah functional is highly nonsmooth, regularizations such as the Ambrosio-Tortorelli approximation can be considered; it is one of the most computationally efficient approximations of the Mumford-Shah functional for image segmentation. While very impressive numerical results have been achieved in a large range of applications when minimising the functional, no analytical results are currently available for minimizers of the functional in the piecewise smooth setting, and providing such results is the goal of this work. Our main result is the Γ-convergence of the Ambrosio-Tortorelli approximation of the Mumford-Shah functional for piecewise smooth approximations. This requires the introduction of an appropriate function space. As a consequence of our Γ-convergence result, we can infer the convergence of minimizers of the respective functionals.
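
    For reference, one common form of the Ambrosio-Tortorelli approximation discussed above (generic notation, not quoted from the paper; the small parameter \eta_\varepsilon is often included for coercivity) couples the image u with a phase-field v that vanishes near edges:

    \[
    AT_\varepsilon(u, v) = \int_\Omega \bigl(v^2 + \eta_\varepsilon\bigr)\,\lvert\nabla u\rvert^2 \,dx
    + \alpha \int_\Omega \Bigl(\varepsilon\,\lvert\nabla v\rvert^2 + \frac{(1 - v)^2}{4\varepsilon}\Bigr)\,dx
    + \beta \int_\Omega (u - u_0)^2 \,dx.
    \]

    As \varepsilon \to 0, v is forced toward 1 away from edges while the middle term concentrates on the discontinuity set of u, recovering the length penalty of the Mumford-Shah functional in the Γ-limit.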