Newton's method is usually preferred when solving optimization
problems due to its superior convergence properties compared to
gradient-based or derivative-free optimization algorithms. However,
deriving and computing the second-order derivatives needed by Newton's
method is often not trivial and, in some cases, not possible. In such
cases quasi-Newton algorithms are a great alternative. In this paper,
we provide a new derivation of well-known quasi-Newton formulas in an
infinite-dimensional Hilbert space setting. It is known that
quasi-Newton update formulas are solutions to certain variational
problems over the space of symmetric matrices. In this paper, we
formulate similar variational problems over the space of bounded
symmetric operators in Hilbert spaces. By changing the constraints of
the variational problem we obtain updates (for the Hessian and Hessian
inverse) not only for the Broyden-Fletcher-Goldfarb-Shanno (BFGS)
quasi-Newton method but also for Davidon-Fletcher-Powell (DFP),
Symmetric Rank One (SR1), and Powell-Symmetric-Broyden (PSB). In
addition, for an inverse problem governed by a partial differential
equation (PDE), we derive DFP and BFGS "structured" secant formulas
that explicitly use the derivative of the regularization term and only
approximate the second derivative of the misfit term. We show
numerical results that demonstrate the desired mesh-independence
property and the superior performance of the resulting quasi-Newton
methods.
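As a concrete finite-dimensional reference point, the BFGS inverse-Hessian update that the variational derivation recovers can be written in a few lines of Python. The dense-matrix form and the function name below are our own illustration, not code from the paper:

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """One BFGS update of an inverse-Hessian approximation H.

    s = x_{k+1} - x_k (step), y = grad_{k+1} - grad_k (gradient change).
    The updated matrix satisfies the secant condition H_new @ y = s,
    stays symmetric, and is closest to H in a weighted norm -- the
    variational characterization the paper generalizes to Hilbert spaces.
    """
    rho = 1.0 / (y @ s)                 # requires the curvature condition y @ s > 0
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)
```

The property worth checking is the secant condition `H_new @ y == s`, which every update in the BFGS/DFP/SR1/PSB family enforces; the updates differ in which norm they minimize subject to it.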
Reduced Order Model Hessian Approximations in Newton Methods for Optimal Control
This paper introduces reduced order model (ROM) based Hessian approximations for use in inexact Newton methods for the solution of optimization problems implicitly constrained by a large-scale system, typically a discretization of a partial differential equation (PDE). The direct application of an inexact Newton method to this problem requires the solution of many PDEs per optimization iteration. To reduce the computational complexity, a ROM Hessian approximation is proposed. Since only the Hessian is approximated, while the original objective function and its gradient are used, the resulting inexact Newton method maintains the first-order global convergence property under suitable assumptions. Thus even computationally inexpensive, lower-fidelity ROMs can be used, which differs from ROM approaches that replace the original optimization problem by a sequence of ROM optimization problems and typically need to accurately approximate function and gradient information of the original problem. In the proposed approach, the quality of the ROM Hessian approximation determines the rate of convergence, but not whether the method converges. The projection-based ROM is constructed from state and adjoint snapshots and is relatively inexpensive to compute. Numerical examples on semilinear parabolic optimal control problems demonstrate that the proposed approach can lead to substantial savings in terms of overall PDE solves required.
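A minimal sketch of the underlying idea in a generic Python setting: the Newton system is solved with an approximate Hessian, but the line search uses the true objective and gradient, so convergence does not hinge on the approximation quality. The helper names and the diagonal "low-fidelity" Hessian in the test are illustrative stand-ins, not the paper's ROM construction:

```python
import numpy as np

def cg_solve(matvec, b, maxiter=20, tol=1e-12):
    """A few conjugate-gradient iterations for matvec(x) = b (SPD operator)."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    for _ in range(maxiter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def inexact_newton(f, grad, approx_hess_vec, x0, tol=1e-8, max_iter=200):
    """Inexact Newton iteration: exact gradient, approximate Hessian.

    approx_hess_vec(x, v) may come from any lower-fidelity model; the step
    is safeguarded by Armijo backtracking on the true objective, so global
    convergence does not depend on the Hessian approximation quality.
    """
    x = x0.copy()
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        p = cg_solve(lambda v: approx_hess_vec(x, v), -g)
        if g @ p >= 0:          # safeguard: fall back to steepest descent
            p = -g
        t = 1.0                 # Armijo backtracking on the *true* objective
        while f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5
        x = x + t * p
    return x
```

A crude approximate Hessian only slows the rate; the iteration still reaches the true minimizer, mirroring the paper's separation between convergence (gradient-driven) and rate (Hessian-quality-driven).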
 NSF-PAR ID: 10345793
 Editor(s): Beattie, C.A.; Benner, P.; Embree, M.; Gugercin, S.; Lefteriu, S.
 Journal Name: Realization and Model Reduction of Dynamical Systems - A Festschrift in Honor of the 70th Birthday of Thanos Antoulas
 Page Range: 335-351
 Sponsoring Org: National Science Foundation
More Like this


Quasi-Newton algorithms are among the most popular iterative methods for solving unconstrained minimization problems, largely due to their favorable superlinear convergence property. However, existing results for these algorithms are limited, as they provide either (i) a global convergence guarantee with an asymptotic superlinear convergence rate, or (ii) a local non-asymptotic superlinear rate for the case that the initial point and the initial Hessian approximation are chosen properly. In particular, no current analysis for quasi-Newton methods guarantees global convergence with an explicit superlinear convergence rate. In this paper, we close this gap and present the first globally convergent quasi-Newton method with an explicit non-asymptotic superlinear convergence rate. Unlike classical quasi-Newton methods, we build our algorithm upon the hybrid proximal extragradient method and propose a novel online learning framework for updating the Hessian approximation matrices. Specifically, guided by the convergence analysis, we formulate the Hessian approximation update as an online convex optimization problem in the space of matrices, and we relate the bounded regret of the online problem to the superlinear convergence of our method.

Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate
Second-order optimization methods, such as cubic regularized Newton methods, are known for their rapid convergence rates; nevertheless, they become impractical in high-dimensional problems due to their substantial memory requirements and computational costs. One promising approach is to execute second-order updates within a lower-dimensional subspace, giving rise to subspace second-order methods. However, the majority of existing subspace second-order methods randomly select subspaces, consequently resulting in slower convergence rates that depend on the problem's dimension $d$. In this paper, we introduce a novel subspace cubic regularized Newton method that achieves a dimension-independent global convergence rate of $\mathcal{O}\left(\frac{1}{mk}+\frac{1}{k^2}\right)$ for solving convex optimization problems. Here, $m$ represents the subspace dimension, which can be significantly smaller than $d$. Instead of adopting a random subspace, our primary innovation involves performing the cubic regularized Newton update within the Krylov subspace associated with the Hessian and the gradient of the objective function. This result marks the first instance of a dimension-independent convergence rate for a subspace second-order method. Furthermore, when specific spectral conditions of the Hessian are met, our method recovers the convergence rate of a full-dimensional cubic regularized Newton method. Numerical experiments show our method converges faster than existing random subspace methods, especially for high-dimensional problems.
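The Krylov subspace in question can be built with the Lanczos process using only Hessian-vector products. The sketch below is a generic illustration (not the authors' code) of how an orthonormal basis of span{g, Hg, ..., H^(m-1)g} is obtained, after which the cubic regularized subproblem can be solved in the small m-dimensional coordinates:

```python
import numpy as np

def lanczos_basis(hess_vec, g, m):
    """Orthonormal basis of the Krylov subspace span{g, Hg, ..., H^{m-1}g}
    built with the Lanczos three-term recurrence; H is accessed only
    through Hessian-vector products hess_vec(v)."""
    n = len(g)
    Q = np.zeros((n, m))
    Q[:, 0] = g / np.linalg.norm(g)
    beta = 0.0
    for j in range(m):
        w = hess_vec(Q[:, j])
        if j > 0:
            w -= beta * Q[:, j - 1]          # orthogonalize against q_{j-1}
        alpha = Q[:, j] @ w
        w -= alpha * Q[:, j]                 # orthogonalize against q_j
        beta = np.linalg.norm(w)
        if j + 1 < m:
            if beta < 1e-12:                 # subspace is invariant; stop early
                Q = Q[:, : j + 1]
                break
            Q[:, j + 1] = w / beta
    return Q
```

Because the gradient is the first basis vector, the projected model retains exact first-order information, which is one intuition for why the Krylov choice avoids the dimension dependence of random subspaces.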

We present an extensible software framework, hIPPYlib, for the solution of large-scale deterministic and Bayesian inverse problems governed by partial differential equations (PDEs) with (possibly) infinite-dimensional parameter fields (which are high-dimensional after discretization). hIPPYlib overcomes the prohibitively expensive nature of Bayesian inversion for this class of problems by implementing state-of-the-art scalable algorithms for PDE-based inverse problems that exploit the structure of the underlying operators, notably the Hessian of the log-posterior. The key property of the algorithms implemented in hIPPYlib is that the solution of the inverse problem is computed at a cost, measured in linearized forward PDE solves, that is independent of the parameter dimension. The mean of the posterior is approximated by the MAP point, which is found by minimizing the negative log-posterior with an inexact matrix-free Newton-CG method. The posterior covariance is approximated by the inverse of the Hessian of the negative log-posterior evaluated at the MAP point. The construction of the posterior covariance is made tractable by invoking a low-rank approximation of the Hessian of the log-likelihood. Scalable tools for sample generation are also discussed. hIPPYlib makes all of these advanced algorithms easily accessible to domain scientists and provides an environment that expedites the development of new algorithms.
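hIPPYlib's actual API is not reproduced here, but the low-rank idea behind the tractable posterior covariance can be illustrated generically: if the data-misfit Hessian is approximated by a rank-r factorization U diag(lam) U^T, the Woodbury identity reduces the posterior covariance to the prior covariance plus a correction requiring only an r-by-r solve. A hypothetical NumPy sketch (all names are ours):

```python
import numpy as np

def lowrank_posterior_cov(prior_cov, U, lam):
    """Posterior covariance (Gamma_pr^{-1} + U diag(lam) U^T)^{-1}
    via the Woodbury identity: only an r x r system is solved, so the
    cost is governed by the rank r, not the parameter dimension."""
    C = np.diag(1.0 / lam) + U.T @ prior_cov @ U   # r x r core matrix
    K = prior_cov @ U
    return prior_cov - K @ np.linalg.solve(C, K.T)
```

In the infinite-dimensional setting the same structure lets one apply the posterior covariance to vectors using a handful of eigenpairs of the prior-preconditioned misfit Hessian, which is what makes the construction dimension-independent.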

Full waveform inversion (FWI) and least-squares reverse time migration (LSRTM) are popular imaging techniques that can be solved as PDE-constrained optimization problems. Due to their large-scale nature, gradient- and Hessian-based optimization algorithms are preferred in practice to find the optimizer iteratively. However, a balance between the evaluation cost and the rate of convergence needs to be considered. We propose the use of Anderson acceleration (AA), a popular strategy to speed up the convergence of fixed-point iterations, to accelerate a gradient descent method. We show that AA can achieve fast convergence that provides competitive results with some quasi-Newton methods. Independent of the dimensionality of the unknown parameters, the computational cost of implementing the method can be reduced to an extremely low-dimensional least-squares problem, which makes AA an attractive method for seismic inversion.
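A compact sketch of Anderson acceleration in its standard "Type II" form (generic Python, not the authors' implementation): the next iterate is a combination of recent fixed-point images, with weights from a least-squares problem whose size is the history depth m, not the parameter dimension:

```python
import numpy as np

def anderson(g, x0, m=5, max_iter=200, tol=1e-10):
    """Anderson acceleration of the fixed-point iteration x <- g(x).

    Keeps the last m residual/image differences and solves a tiny
    least-squares problem (dimension m, independent of len(x)) to
    extrapolate the next iterate."""
    x = x0.copy()
    gx = g(x)
    r = gx - x                       # residual of the fixed-point map
    G_hist, R_hist = [gx], [r]
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        if len(R_hist) > 1:
            dR = np.column_stack([R_hist[i + 1] - R_hist[i]
                                  for i in range(len(R_hist) - 1)])
            dG = np.column_stack([G_hist[i + 1] - G_hist[i]
                                  for i in range(len(G_hist) - 1)])
            gamma, *_ = np.linalg.lstsq(dR, r, rcond=None)
            x = gx - dG @ gamma      # extrapolated iterate
        else:
            x = gx                   # plain fixed-point step to start
        gx = g(x)
        r = gx - x
        G_hist.append(gx)
        R_hist.append(r)
        if len(R_hist) > m + 1:      # drop the oldest history entry
            G_hist.pop(0)
            R_hist.pop(0)
    return x
```

For gradient descent, the map is g(x) = x - step * grad(x); the least-squares system stays m-by-m however large x is, which is the cost property the abstract highlights.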