Title: Second Order Adjoints in Optimization
Second-order, Newton-like algorithms exhibit convergence properties superior to gradient-based or derivative-free optimization algorithms. However, deriving and computing the second-order derivatives needed for the Hessian-vector product in a Krylov iteration for the Newton step is often not trivial. Second-order adjoints provide a systematic and efficient tool to derive second-derivative information. In this paper, we consider equality-constrained optimization problems in an infinite-dimensional setting. We phrase the optimization problem in a general Banach space framework and derive second-order sensitivities and second-order adjoints in a rigorous and general way. We apply the developed framework to a partial differential equation-constrained optimization problem.
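The practical payoff of this setup is that the Newton step can be computed matrix-free: a Krylov solver only ever asks for Hessian-vector products, which is exactly what second-order adjoints supply. Below is a minimal sketch of that pattern; the Rosenbrock test function and SciPy helpers are illustrative stand-ins, not the paper's PDE-constrained setting.

```python
import numpy as np
from scipy.optimize import rosen_der, rosen_hess_prod
from scipy.sparse.linalg import LinearOperator, cg

# Current iterate (illustrative choice).
x = np.full(10, 1.2)
g = rosen_der(x)  # gradient at x

# Matrix-free Hessian: the Krylov solver only needs v -> H(x) v.
# In the paper's setting, each such product would be delivered by a
# second-order adjoint (one extra forward + adjoint solve), not by
# forming the Hessian explicitly.
H = LinearOperator((x.size, x.size),
                   matvec=lambda v: rosen_hess_prod(x, v),
                   dtype=float)

# Inexact Newton step: solve H p = -g with conjugate gradients.
p, info = cg(H, -g)
x_new = x + p  # a globalization (line search / trust region) would guard this
```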
Award ID(s):
1654311
PAR ID:
10352319
Author(s) / Creator(s):
Editor(s):
Al-Baali, Mehiddin; Purnama, Anton; Grandinetti, Lucio
Date Published:
Journal Name:
Numerical Analysis and Optimization
Volume:
354
Page Range / eLocation ID:
209-230
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Newton's method is usually preferred when solving optimization problems because of its superior convergence properties compared to gradient-based or derivative-free optimization algorithms. However, deriving and computing the second-order derivatives needed by Newton's method is often not trivial and, in some cases, not possible. In such cases quasi-Newton algorithms are an attractive alternative. In this paper, we provide a new derivation of well-known quasi-Newton formulas in an infinite-dimensional Hilbert space setting. It is known that quasi-Newton update formulas are solutions to certain variational problems over the space of symmetric matrices. Here, we formulate similar variational problems over the space of bounded symmetric operators on Hilbert spaces. By changing the constraints of the variational problem we obtain updates (for the Hessian and the Hessian inverse) not only for the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method but also for Davidon-Fletcher-Powell (DFP), Symmetric Rank One (SR1), and Powell-Symmetric-Broyden (PSB). In addition, for an inverse problem governed by a partial differential equation (PDE), we derive DFP and BFGS "structured" secant formulas that explicitly use the derivative of the regularization and only approximate the second derivative of the misfit term. We show numerical results that demonstrate the desired mesh-independence property and the superior performance of the resulting quasi-Newton methods.
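For concreteness, the finite-dimensional BFGS inverse-Hessian update that the abstract above generalizes to bounded operators on Hilbert spaces looks as follows; this is the classical formula satisfying the secant equation, shown here as a hedged stand-in for the operator version derived in the paper.

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """Classical BFGS update of an inverse-Hessian approximation H.

    s = x_{k+1} - x_k and y = grad f(x_{k+1}) - grad f(x_k); the update
    enforces the secant equation H_new @ y == s.  Finite-dimensional
    stand-in for the bounded-operator updates in the paper.
    """
    rho = 1.0 / (y @ s)  # requires the curvature condition y @ s > 0
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)
```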
  2. In this article, we study linearly constrained policy optimization over the manifold of Schur stabilizing controllers, equipped with a Riemannian metric that emerges naturally in the context of optimal control problems. We provide an extrinsic analysis of a generic constrained smooth cost function that subsequently allows any such constrained problem to be subsumed into this framework. By studying the second-order geometry of this manifold, we provide a Newton-type algorithm that relies on neither the exponential mapping nor a retraction, while ensuring local convergence guarantees. The algorithm hinges instead on the developed stability certificate and the linear structure of the constraints. We then apply our methodology to two well-known constrained optimal control problems. Finally, several numerical examples showcase the performance of the proposed algorithm.
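One ingredient of the abstract above is easy to make concrete: using Schur stability of the closed-loop matrix as a feasibility certificate, and backtracking a Newton-type step until it stays in the stabilizing set, in place of a retraction. The step logic below is a hedged sketch under that assumption, not the paper's algorithm.

```python
import numpy as np

def is_schur_stable(A, B, K):
    """Certificate: the gain K is feasible iff the closed-loop matrix
    A - B @ K has spectral radius strictly below 1 (Schur stability)."""
    return np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1.0

def certified_step(K, direction, A, B, t0=1.0, shrink=0.5, max_tries=30):
    """Backtrack along a Newton-type direction until the trial gain stays
    Schur stabilizing -- a stand-in for stepping without a retraction."""
    t = t0
    for _ in range(max_tries):
        K_trial = K + t * direction
        if is_schur_stable(A, B, K_trial):
            return K_trial
        t *= shrink
    return K  # no certified step found at these step sizes
```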
  3. This paper revisits and extends the adjoint theory for glacial isostatic adjustment (GIA) of Crawford et al. (2018). Rotational feedbacks are now incorporated, and the application of the second-order adjoint method is described for the first time. The first-order adjoint method provides an efficient means of computing sensitivity kernels for a chosen objective functional, while the second-order adjoint method provides second-derivative information in the form of Hessian kernels. These latter kernels are required by efficient Newton-type optimization schemes and by methods for quantifying uncertainty in non-linear inverse problems. Most importantly, the entire theory has been reformulated to simplify its implementation by others within the GIA community. In particular, the rate formulation of the GIA forward problem introduced by Crawford et al. (2018) has been replaced with the conventional equations for modelling GIA in laterally heterogeneous earth models. The implementation of the first- and second-order adjoint problems should be relatively easy within both existing and new GIA codes, with only the inclusion of more general force terms required.
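The adjoint pattern in the abstract above can be illustrated generically: for a linear forward model, one extra pair of solves turns a sensitivity kernel into a Hessian kernel applied to a direction. The toy linear system below is an assumption for illustration only; it is not a GIA solver.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_par = 8, 5
A = np.eye(n_obs) + 0.1 * rng.standard_normal((n_obs, n_obs))  # forward operator
G = rng.standard_normal((n_obs, n_par))                        # parameter-to-load map
d = rng.standard_normal(n_obs)                                 # synthetic data
m = rng.standard_normal(n_par)                                 # model parameters

# Objective J(m) = 0.5 * ||u(m) - d||^2 with forward problem A u = G m.
u = np.linalg.solve(A, G @ m)        # forward solve
lam = np.linalg.solve(A.T, u - d)    # first-order adjoint solve
gradient = G.T @ lam                 # sensitivity-kernel analogue

def hess_vec(v):
    """Second-order adjoint: one extra forward + adjoint solve gives the
    Hessian-kernel analogue applied to a model perturbation v."""
    du = np.linalg.solve(A, G @ v)    # forward sensitivity along v
    dlam = np.linalg.solve(A.T, du)   # second-order adjoint solve
    return G.T @ dlam
```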
  4. We consider variants of a recently developed Newton-CG algorithm for nonconvex problems (Royer, C. W. & Wright, S. J. (2018) Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. SIAM J. Optim., 28, 1448–1477) in which inexact estimates of the gradient and the Hessian information are used for various steps. Under certain conditions on the inexactness measures, we derive iteration complexity bounds for achieving $\epsilon$-approximate second-order optimality that match best-known lower bounds. Our inexactness condition on the gradient is adaptive, allowing for crude accuracy in regions with large gradients. We describe two variants of our approach, one in which the step size along the computed search direction is chosen adaptively, and another in which the step size is pre-defined. To obtain second-order optimality, our algorithms will make use of a negative curvature direction on some steps. These directions can be obtained, with high probability, using the randomized Lanczos algorithm. In this sense, all of our results hold with high probability over the run of the algorithm. We evaluate the performance of our proposed algorithms empirically on several machine learning models. Our approach is a first attempt to introduce inexact Hessian and/or gradient information into the Newton-CG algorithm of Royer & Wright (2018).
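One step of the abstract above is easy to sketch: probing for a negative-curvature direction using only Hessian-vector products. The ARPACK-based eigsh call below is a stand-in for the randomized Lanczos procedure the paper analyses; the function name and tolerances are illustrative assumptions.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def negative_curvature_direction(hess_vec, n, tol=1e-2, seed=0):
    """Look for v with v.T @ H @ v < 0 using a Lanczos-type eigensolver
    driven purely by Hessian-vector products."""
    rng = np.random.default_rng(seed)
    H = LinearOperator((n, n), matvec=hess_vec, dtype=float)
    w, V = eigsh(H, k=1, which="SA", v0=rng.standard_normal(n), tol=tol)
    return V[:, 0] if w[0] < 0 else None  # None: no negative curvature found
```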
  5. This paper addresses the study of a new class of nonsmooth optimization problems, where the objective is represented as a difference of two generally nonconvex functions. We propose and develop a novel Newton-type algorithm for solving such problems, which is based on the coderivative-generated second-order subdifferential (generalized Hessian) and employs advanced tools of variational analysis. Well-posedness properties of the proposed algorithm are derived under fairly general requirements, while constructive convergence rates are established under additional assumptions including the Kurdyka–Łojasiewicz condition. We provide applications of the main algorithm to a general class of nonsmooth nonconvex problems of structured optimization that encompasses, in particular, optimization problems with explicit constraints. Finally, applications and numerical experiments are given for practical problems arising in biochemical models, supervised learning, constrained quadratic programming, and other areas, where the advantages of our algorithms are demonstrated in comparison with known techniques and results.
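As a point of reference for the problem class f = g - h discussed above, here is a plain DCA (difference-of-convex algorithm) iteration on a small quadratic-minus-l1 instance. This baseline, including the test problem, is an illustrative assumption and is deliberately simpler than the paper's coderivative-based Newton method.

```python
import numpy as np

# DC objective f(x) = g(x) - h(x) with g(x) = 0.5 x'Qx + c'x (strongly convex)
# and h(x) = ||x||_1 (convex).  DCA linearizes h and minimizes g exactly.
rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))
Q = M @ M.T + 2.0 * np.eye(n)      # strongly convex quadratic part
c = rng.standard_normal(n)
x = np.zeros(n)

for _ in range(50):
    v = np.sign(x)                  # a subgradient of h at x
    x = np.linalg.solve(Q, v - c)   # x_{k+1} = argmin_x g(x) - <v, x>
```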