Title: Anderson acceleration for contractive and noncontractive operators
Abstract: A one-step analysis of Anderson acceleration with general algorithmic depths is presented. The resulting residual bounds, within both contractive and noncontractive settings, reveal the balance between the contributions of the higher- and lower-order terms, both of which depend on the success of the optimization problem solved at each step of the algorithm. The new residual bounds show that the additional terms introduced by the extrapolation are of higher order than was previously understood. In the contractive setting these bounds sharpen previous convergence and acceleration results. The bounds rely on sufficient linear independence of the differences between consecutive residuals, rather than on assumptions on the boundedness of the optimization coefficients, allowing the introduction of a theoretically sound safeguarding strategy. Several numerical tests illustrate the analysis, primarily in the noncontractive setting, and demonstrate the use of the method, the safeguarding strategy, and theory-based guidance on dynamic selection of the algorithmic depth on a p-Laplace equation, a nonlinear Helmholtz equation, and the steady Navier–Stokes equations at high Reynolds number in three spatial dimensions.
Award ID(s): 2011490
PAR ID: 10327784
Author(s) / Creator(s): ;
Date Published:
Journal Name: IMA Journal of Numerical Analysis
Volume: 41
Issue: 4
ISSN: 0272-4979
Page Range / eLocation ID: 2841 to 2872
Format(s): Medium: X
Sponsoring Org: National Science Foundation
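
To make the quantities in the abstract concrete, here is a minimal Python sketch of Anderson acceleration of depth m for a fixed-point iteration x_{k+1} = g(x_k), with a safeguard in the spirit described above: the depth is reduced when the differences of consecutive residuals become nearly linearly dependent. The routine, its tolerances, and the conditioning test are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def anderson(g, x0, m=3, tol=1e-10, max_iter=100, cond_max=1e8):
    """Anderson acceleration of depth m for x_{k+1} = g(x_k) (sketch)."""
    x = np.asarray(x0, dtype=float)
    G, F = [], []                     # histories of g(x_j) and residuals
    for k in range(max_iter):
        gx = g(x)
        f = gx - x                    # residual f_k = g(x_k) - x_k
        G.append(gx); F.append(f)
        if np.linalg.norm(f) < tol:
            return x
        mk = min(m, k)                # effective depth at this step
        if mk == 0:
            x = gx                    # plain fixed-point step at k = 0
            continue
        # Columns: differences of consecutive residuals, oldest first.
        dF = np.column_stack([F[j + 1] - F[j] for j in range(k - mk, k)])
        dG = np.column_stack([G[j + 1] - G[j] for j in range(k - mk, k)])
        # Safeguard (illustrative): drop the oldest columns while the
        # residual differences are close to linearly dependent.
        while dF.shape[1] > 1 and np.linalg.cond(dF) > cond_max:
            dF, dG = dF[:, 1:], dG[:, 1:]
        gamma = np.linalg.lstsq(dF, f, rcond=None)[0]   # per-step LS problem
        x = gx - dG @ gamma           # extrapolated update
    return x

# Example: the contractive map g(x) = cos(x) has a unique fixed point.
print(anderson(np.cos, np.array([1.0]), m=2))
```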
More Like this
  1. State-of-the-art seismic imaging techniques treat inversion tasks such as full-waveform inversion (FWI) and least-squares reverse time migration (LSRTM) as partial differential equation-constrained optimization problems. Due to the large-scale nature of these problems, gradient-based optimization algorithms are preferred in practice to update the model iteratively. Higher-order methods converge in fewer iterations but often require higher computational costs, more line-search steps, and bigger memory storage, so a balance among these aspects has to be considered. We evaluate Anderson acceleration (AA), a popular strategy for speeding up the convergence of fixed-point iterations, as a means of accelerating the steepest-descent algorithm, which we treat as a fixed-point iteration. Independent of the dimensionality of the unknown parameters, the additional computational cost of the method reduces to an extremely low-dimensional least-squares problem, and this cost can be further reduced by a low-rank update. We establish the theoretical connections and differences between AA and other well-known optimization methods such as L-BFGS and the restarted generalized minimal residual method, and compare their computational cost and memory requirements. Numerical examples of FWI and LSRTM applied to the Marmousi benchmark demonstrate the acceleration effects of AA. Compared with the steepest-descent method, AA achieves faster convergence and provides results competitive with some quasi-Newton methods, making it an attractive optimization strategy for seismic inversion.
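    A hedged sketch of the reduction this abstract describes: steepest descent is the fixed-point iteration x_{k+1} = x_k - alpha * grad f(x_k), so Anderson acceleration applies directly. The quadratic objective, sizes, step size, and the reuse of the `anderson` routine sketched earlier on this page are illustrative assumptions, not the paper's FWI setup.

    ```python
    import numpy as np

    # Toy strongly convex quadratic f(x) = 0.5 x^T A x - b^T x (illustrative).
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((50, 50))
    A = Q @ Q.T + 50.0 * np.eye(50)                 # SPD Hessian
    b = rng.standard_normal(50)
    alpha = 1.0 / np.linalg.norm(A, 2)              # stable step size

    # Steepest descent as a fixed-point map: g(x) = x - alpha * grad f(x).
    g = lambda x: x - alpha * (A @ x - b)

    x = anderson(g, np.zeros(50), m=5, max_iter=200)   # `anderson` from above
    print("gradient norm at AA iterate:", np.linalg.norm(A @ x - b))
    ```

    The per-step least-squares problem inside `anderson` has only m columns, which is the "extremely low-dimensional" cost the abstract refers to.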
  2. In this paper we develop convergence and acceleration theory for Anderson acceleration applied to Newton's method for nonlinear systems in which the Jacobian is singular at a solution. For these problems, the standard Newton algorithm converges linearly in a region about the solution, and it has been previously observed that Anderson acceleration can substantially improve convergence without additional a priori knowledge and with little additional computational cost. We present an analysis of the Newton-Anderson algorithm in this context, and introduce a novel and theoretically supported safeguarding strategy. The convergence results are demonstrated with the Chandrasekhar H-equation and a variety of benchmark examples.
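    A minimal illustration of the setting: for f(x) = x^2, the Jacobian f'(x) = 2x is singular at the root x* = 0, so plain Newton only contracts linearly, while depth-one Anderson applied to the Newton map recovers the root far faster. The toy problem and the unsafeguarded coefficient formula below are illustrative, not the paper's algorithm or safeguarding strategy.

    ```python
    # f(x) = x**2 has root x* = 0 where f'(x) = 2x is singular; the Newton
    # map g(x) = x - f(x)/f'(x) = x/2 therefore contracts only linearly.
    newton = lambda x: x - x**2 / (2.0 * x)

    x = 1.0
    for _ in range(10):
        x = newton(x)
    print("plain Newton after 10 steps:", x)        # about 2**-10

    # Depth-one Newton-Anderson: extrapolate with the last two residuals.
    xm, x = 1.0, newton(1.0)
    for _ in range(10):
        g0, g1 = newton(xm), newton(x)              # (recomputed for clarity)
        r0, r1 = g0 - xm, g1 - x                    # Newton residuals
        gamma = r1 * (r1 - r0) / (r1 - r0) ** 2     # scalar least-squares coeff.
        xm, x = x, g1 - gamma * (g1 - g0)
        if abs(x) < 1e-15:                          # converged (exactly, here)
            break
    print("depth-one Newton-Anderson:", x)
    ```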
  3. The theory of integral quadratic constraints (IQCs) allows the certification of exponential convergence of interconnected systems containing nonlinear or uncertain elements. In this work, we adapt the IQC theory to study first-order methods for smooth and strongly-monotone games and show how to design tailored quadratic constraints to obtain tight upper bounds on convergence rates. Using this framework, we recover the existing bound for the gradient method (GD), derive sharper bounds for the proximal point method (PPM) and optimistic gradient method (OG), and provide for the first time a global convergence rate for the negative momentum method (NM) with an iteration complexity O(κ^1.5), which matches its known lower bound. In addition, for time-varying systems, we prove that the gradient method with optimal step size achieves the fastest provable worst-case convergence rate with quadratic Lyapunov functions. Finally, we further extend our analysis to stochastic games and study the impact of multiplicative noise on different algorithms. We show that it is impossible for an algorithm with one step of memory to achieve acceleration if it only queries the gradient once per batch (in contrast with the stochastic strongly-convex optimization setting, where such acceleration has been demonstrated). However, we exhibit an algorithm which achieves acceleration with two gradient queries per batch.
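    A sketch of one of the methods analyzed, the optimistic gradient method, on a toy smooth strongly-monotone game; the game, step size, and iteration count are made up for illustration and are not from the paper.

    ```python
    import numpy as np

    # min_x max_y  mu/2 |x|^2 + x^T A y - mu/2 |y|^2 : its gradient operator
    # F(z) = (mu*x + A y, mu*y - A^T x) is mu-strongly monotone and Lipschitz.
    rng = np.random.default_rng(1)
    n, mu = 20, 1.0
    A = rng.standard_normal((n, n))
    F = lambda z: np.concatenate([mu * z[:n] + A @ z[n:],
                                  mu * z[n:] - A.T @ z[:n]])

    L = np.sqrt(mu**2 + np.linalg.norm(A, 2)**2)    # Lipschitz constant of F
    eta = 1.0 / (4.0 * L)                           # conservative step size

    # Optimistic gradient: z_{k+1} = z_k - eta * (2 F(z_k) - F(z_{k-1})).
    z = rng.standard_normal(2 * n)
    Fp = F(z)                                       # convention z_{-1} = z_0
    for _ in range(500):
        Fz = F(z)
        z, Fp = z - eta * (2.0 * Fz - Fp), Fz
    print("distance to equilibrium:", np.linalg.norm(z))   # z* = 0
    ```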
  4. This paper develops an efficient and robust solution technique for the steady Boussinesq model of non-isothermal flow using Anderson acceleration applied to a Picard iteration. After analyzing the fixed point operator associated with the nonlinear iteration to prove that certain stability and regularity properties hold, we apply the authors' recently constructed theory for Anderson acceleration, which yields a convergence result for the Anderson accelerated Picard iteration for the Boussinesq system. The result shows that the leading term in the residual is improved by the gain in the optimization problem, but at the cost of additional higher order terms that can be significant when the residual is large. We perform numerical tests that illustrate the theory, and show that a 2-stage choice of Anderson depth can be advantageous. We also consider Anderson acceleration applied to the Newton iteration for the Boussinesq equations, and observe that the acceleration allows the Newton iteration to converge for significantly higher Rayleigh numbers than it could without acceleration, even with a standard line search.
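    One plausible reading of the 2-stage depth choice, sketched generically: since the higher-order terms in the bound scale with the residual, keep the depth small while the residual is large and enlarge it once the iterate is closer to the solution. The schedule, tolerance, and depths below are hypothetical, not taken from the paper.

    ```python
    import numpy as np

    def two_stage_depth(res_norm, switch_tol=1e-2, m_small=1, m_large=5):
        """Hypothetical 2-stage Anderson depth schedule: a small depth while
        the residual is large (where the higher-order terms in the bound can
        dominate), a larger depth once the residual is small.  All constants
        are illustrative."""
        return m_small if res_norm > switch_tol else m_large

    # Inside an Anderson-Picard loop one would call, at each iteration k,
    #     m_k = two_stage_depth(np.linalg.norm(f_k))
    # and truncate the residual-difference history to the last m_k columns.
    ```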
  5. Bounds on the smallest eigenvalue of the neural tangent kernel (NTK) are a key ingredient in the analysis of neural network optimization and memorization. However, existing results require distributional assumptions on the data and are limited to a high-dimensional setting, where the input dimension d0 scales at least logarithmically in the number of samples n. In this work we remove both of these requirements and instead provide bounds in terms of a measure of distance between data points: notably these bounds hold with high probability even when d0 is held constant versus n. We prove our results through a novel application of the hemisphere transform.
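    A hedged sketch of the object being bounded: the empirical NTK Gram matrix of a two-layer ReLU network with respect to its first-layer weights, with the input dimension d0 held small while n grows. The architecture, sizes, and unit-norm data are illustrative assumptions, not the paper's setting.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    n, d0, m = 100, 5, 2000           # samples, input dim (held small), width
    X = rng.standard_normal((n, d0))
    X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
    W = rng.standard_normal((m, d0))                # first-layer weights
    a = rng.choice([-1.0, 1.0], size=m)             # fixed second layer

    # Empirical NTK (w.r.t. first-layer weights) of the two-layer ReLU net
    # f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x):
    # K_ij = (1/m) * sum_r 1[w_r.x_i > 0] 1[w_r.x_j > 0] * (x_i . x_j).
    S = (X @ W.T > 0).astype(float)          # n x m ReLU activation pattern
    K = (S @ S.T / m) * (X @ X.T)            # entrywise product -> NTK Gram
    lam_min = np.linalg.eigvalsh(K)[0]       # smallest eigenvalue (ascending)
    print("smallest NTK eigenvalue:", lam_min)
    ```

    Bounds of the kind the abstract describes control lam_min in terms of how well separated the rows of X are, rather than via distributional assumptions.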