Title: A modified limited memory steepest descent method motivated by an inexact super-linear convergence rate analysis
Abstract: How to choose the step size of the gradient descent method has been a popular subject of research. In this paper we propose a modified limited memory steepest descent method (MLMSD). In each iteration, a selection rule picks a single step size from a candidate set computed by Fletcher’s limited memory steepest descent method (LMSD), instead of sweeping through all the step sizes as in Fletcher’s original LMSD algorithm. MLMSD is motivated by an inexact super-linear convergence rate analysis. The R-linear convergence of MLMSD is proved for a strictly convex quadratic minimization problem. Numerical tests are presented to show that our algorithm is efficient and robust.
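The abstract describes the mechanism but not the selection rule itself, so the following is only a minimal Python/NumPy sketch of the underlying LMSD idea on a quadratic: keep the last m gradients, compute the Ritz values of the Hessian restricted to their span, and use the reciprocals of those Ritz values as candidate step sizes. The rule used below (take the smallest candidate) and the function names are placeholders for illustration, not the selection rule proposed in the paper.

```python
import numpy as np

def lmsd_candidate_steps(A, G):
    """Candidate step sizes from the last few gradients (Fletcher's LMSD idea):
    orthonormalize the stored gradients, project A onto their span, and return
    the reciprocals of the resulting Ritz values."""
    Q, _ = np.linalg.qr(G)                      # basis for the gradient subspace
    ritz = np.linalg.eigvalsh(Q.T @ A @ Q)      # Ritz values of A on that subspace
    ritz = ritz[ritz > 0]                       # guard against rounding artifacts
    return 1.0 / ritz

def mlmsd_sketch(A, b, x0, m=5, tol=1e-8, max_iter=1000):
    """Gradient descent on f(x) = 0.5*x'Ax - b'x that picks a single step size
    per iteration from the LMSD candidate set (placeholder rule: the smallest)."""
    x = x0.copy()
    g = A @ x - b
    grads = [g]
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol * np.linalg.norm(b):
            break
        steps = lmsd_candidate_steps(A, np.column_stack(grads))
        alpha = steps.min()                     # placeholder selection rule
        x = x - alpha * g
        g = A @ x - b
        grads.append(g)
        if len(grads) > m:
            grads.pop(0)                        # limited memory: keep last m gradients
    return x

# usage on a small random SPD quadratic
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50.0 * np.eye(50)
b = rng.standard_normal(50)
x = mlmsd_sketch(A, b, np.zeros(50))
print(np.linalg.norm(A @ x - b))                # residual norm at termination
```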
Award ID(s):
1740833
NSF-PAR ID:
10203325
Author(s) / Creator(s):
Date Published:
Journal Name:
IMA Journal of Numerical Analysis
ISSN:
0272-4979
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    State-of-the-art seismic imaging techniques treat inversion tasks such as full-waveform inversion (FWI) and least-squares reverse time migration (LSRTM) as partial differential equation-constrained optimization problems. Due to the large-scale nature of these problems, gradient-based optimization algorithms are preferred in practice to update the model iteratively. Higher-order methods converge in fewer iterations but often require greater computational cost, more line-search steps, and more memory, so a balance among these aspects has to be considered. We evaluate Anderson acceleration (AA), a popular strategy for speeding up the convergence of fixed-point iterations, as a means of accelerating the steepest-descent algorithm, which we treat as a fixed-point iteration. Independent of the unknown parameter dimensionality, the computational cost of the method reduces to an extremely low-dimensional least-squares problem, and the cost can be further reduced by a low-rank update. We establish the theoretical connections and the differences between AA and other well-known optimization methods such as L-BFGS and the restarted generalized minimal residual method, and compare their computational cost and memory requirements. Numerical examples of FWI and LSRTM applied to the Marmousi benchmark demonstrate the acceleration effects of AA. Compared with the steepest-descent method, AA achieves faster convergence and provides competitive results with some quasi-Newton methods, making it an attractive optimization strategy for seismic inversion.
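    As a rough, self-contained illustration of the idea (and not the FWI/LSRTM implementation described above), the sketch below applies windowed (Type-II) Anderson acceleration to the fixed-point map g(x) = x - alpha*grad f(x); per iteration it only solves a least-squares problem whose size is the window length.

```python
import numpy as np

def anderson_gd(grad, x0, alpha, m=5, tol=1e-8, max_iter=300):
    """Anderson acceleration of the fixed-step steepest-descent map
    g(x) = x - alpha*grad(x), treated as a fixed-point iteration."""
    x = x0.copy()
    G_hist, F_hist = [], []                     # map values g(x_i) and residuals g(x_i)-x_i
    for _ in range(max_iter):
        gx = x - alpha * grad(x)                # one steepest-descent step
        f = gx - x                              # fixed-point residual
        if np.linalg.norm(f) < tol:
            return gx
        G_hist.append(gx)
        F_hist.append(f)
        if len(G_hist) > m:                     # keep a sliding window of length m
            G_hist.pop(0)
            F_hist.pop(0)
        if len(F_hist) == 1:
            x = gx                              # plain step until history builds up
            continue
        F = np.column_stack(F_hist)
        G = np.column_stack(G_hist)
        dF = F[:, 1:] - F[:, :-1]               # consecutive residual differences
        dG = G[:, 1:] - G[:, :-1]               # consecutive map-value differences
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)   # tiny least-squares problem
        x = gx - dG @ gamma                     # Anderson-mixed next iterate
    return x

# usage on a quadratic: grad f(x) = A x - b, fixed step 1/||A||_2
rng = np.random.default_rng(1)
M = rng.standard_normal((100, 100))
A = M @ M.T + np.eye(100)
b = rng.standard_normal(100)
x = anderson_gd(lambda z: A @ z - b, np.zeros(100), alpha=1.0 / np.linalg.norm(A, 2))
```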
  2. Fourier ptychographic microscopy enables gigapixel-scale imaging with both a large field of view and high resolution. Using a set of low-resolution images recorded under varying illumination angles, the goal is to computationally reconstruct high-resolution phase and amplitude images. To increase temporal resolution, one may use multiplexed measurements in which the sample is illuminated simultaneously from a subset of the angles. In this paper, we develop an algorithm for Fourier ptychographic microscopy with such multiplexed illumination. Specifically, we consider gradient descent type updates and propose an analytical step size that ensures convergence of the iterates to a stationary point. Furthermore, we propose an accelerated version of our algorithm (with the same step size) which significantly improves the convergence speed. We demonstrate that the practical performance of our algorithm is identical to the case where the step size is manually tuned. Finally, we apply our parameter-free approach to real data and validate its applicability.
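    The multiplexed ptychographic cost function and its analytical step size are not reproduced here; the snippet below only sketches the generic pattern the abstract refers to, a Nesterov-type accelerated gradient iteration that reuses one fixed step size (for example 1/L for an L-smooth cost). The function name and interface are illustrative assumptions.

```python
import numpy as np

def accelerated_gd(grad, x0, step, max_iter=500, tol=1e-8):
    """Nesterov-style accelerated gradient descent with a fixed step size.
    `step` stands in for an analytically chosen step (e.g. 1/L for an
    L-smooth cost); the ptychographic objective itself is not modeled here."""
    x = x0.copy()
    y = x0.copy()                               # extrapolated point
    t = 1.0
    for _ in range(max_iter):
        g = grad(y)
        if np.linalg.norm(g) < tol:
            break
        x_new = y - step * g                    # gradient step from the extrapolated point
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum extrapolation
        x, t = x_new, t_new
    return x
```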
  3. We propose a new fast streaming algorithm for the tensor completion problem of imputing missing entries of a low tubal-rank tensor using the tensor singular value decomposition (t-SVD) algebraic framework. We show that the t-SVD is a specialization of the well-studied block-term decomposition for third-order tensors, and we present an algorithm under this model that can track changing free submodules from incomplete streaming 2-D data. The proposed algorithm uses principles from incremental gradient descent on the Grassmann manifold of subspaces to solve the tensor completion problem with linear complexity and constant memory in the number of time samples. We provide a local expected linear convergence result for our algorithm. Our empirical results are competitive in accuracy but much faster in compute time than state-of-the-art tensor completion algorithms on real applications to recovering temporal chemo-sensing and MRI data under limited sampling.
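    The t-SVD and free-submodule machinery is too involved for a short example, so the sketch below only shows the flavor of an incremental Grassmannian gradient step in the simpler matrix (single-subspace) setting, in the style of the GROUSE subspace-tracking update; it is an analogue for illustration, not the authors' tensor algorithm.

```python
import numpy as np

def grassmann_step(U, idx, v_obs, eta):
    """One GROUSE-style incremental gradient step on the Grassmannian.
    U     : n x d orthonormal basis of the current subspace estimate
    idx   : indices of the observed entries of the incoming column
    v_obs : observed values at those indices
    eta   : step size
    (Matrix analogue shown for illustration; not the t-SVD tensor algorithm.)"""
    w, *_ = np.linalg.lstsq(U[idx, :], v_obs, rcond=None)   # best-fit weights
    p = U @ w                                               # predicted full column
    r = np.zeros(U.shape[0])
    r[idx] = v_obs - p[idx]                                 # residual on observed entries
    norm_r, norm_p, norm_w = np.linalg.norm(r), np.linalg.norm(p), np.linalg.norm(w)
    if norm_r < 1e-14 or norm_p < 1e-14 or norm_w < 1e-14:
        return U                                            # nothing to update
    sigma = norm_r * norm_p
    direction = ((np.cos(sigma * eta) - 1.0) * p / norm_p
                 + np.sin(sigma * eta) * r / norm_r)
    return U + np.outer(direction, w / norm_w)              # geodesic-style rank-one update
```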
  4. We propose a randomized algorithm with a quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function. Our method is based on performing an approximate Newton step using a random projection of the Hessian. Our first contribution is to show that, at each iteration, the embedding dimension (or sketch size) can be as small as the effective dimension of the Hessian matrix. Leveraging this fundamental result, we design an algorithm with a sketch size proportional to the effective dimension that exhibits a quadratic rate of convergence. This result dramatically improves on the classical linear-quadratic convergence rates of state-of-the-art sub-sampled Newton methods. However, in most practical cases the effective dimension is not known beforehand, which raises the question of how to pick a sketch size as small as the effective dimension while preserving a quadratic convergence rate. Our second and main contribution is thus to propose an adaptive sketch-size algorithm with a quadratic convergence rate that does not require prior knowledge or estimation of the effective dimension: at each iteration, it starts with a small sketch size and increases it until quadratic progress is achieved. Importantly, we show that the embedding dimension remains proportional to the effective dimension throughout the entire path and that our method achieves state-of-the-art computational complexity for solving convex optimization programs with a strongly convex component. We discuss and illustrate applications to linear and quadratic programming, as well as logistic regression and other generalized linear models.
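    As a hedged illustration of the basic building block (the adaptive sketch-size rule is not reproduced), the snippet below performs one approximate Newton step for ridge-regularized logistic regression by applying a Gaussian random projection to a square root of the Hessian; the function name and arguments are assumptions made for this sketch.

```python
import numpy as np

def sketched_newton_step(A, y, x, lam, sketch_size, rng):
    """One approximate Newton step for ridge-regularized logistic regression,
    with the Hessian replaced by a Gaussian sketch of its square root.
    A: n x d design matrix, y: labels in {0, 1}, lam: ridge parameter."""
    p = 1.0 / (1.0 + np.exp(-(A @ x)))              # predicted probabilities
    grad = A.T @ (p - y) + lam * x                  # gradient of the regularized loss
    C = np.sqrt(p * (1.0 - p))[:, None] * A         # Hessian square root: H = C'C + lam*I
    S = rng.standard_normal((sketch_size, A.shape[0])) / np.sqrt(sketch_size)
    SC = S @ C                                      # sketched square root (m x d)
    H_sketch = SC.T @ SC + lam * np.eye(A.shape[1]) # sketched Hessian
    return x - np.linalg.solve(H_sketch, grad)      # approximate Newton step
```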
  5.
    Gradient-type iterative methods for solving Hermitian eigenvalue problems can be accelerated by using preconditioning and deflation techniques. A preconditioned steepest descent iteration with implicit deflation (PSD-id) is one such method. The convergence behavior of the PSD-id was recently investigated based on the pioneering work of Samokish on the preconditioned steepest descent method (PSD). The resulting non-asymptotic estimates indicate superlinear convergence of the PSD-id under strong assumptions on the initial guess. The present paper utilizes an alternative convergence analysis of the PSD by Neymeyr under much weaker assumptions. We embed Neymeyr's approach into the analysis of the PSD-id using a restricted formulation of the PSD-id. More importantly, we extend the new convergence analysis to a practically preferred block version of the PSD-id, the BPSD-id, and show the cluster robustness of the BPSD-id. Numerical examples are provided to validate the theoretical estimates.
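    The implicit-deflation (PSD-id) and block (BPSD-id) variants analysed in the paper are not reproduced here; the sketch below shows only the underlying preconditioned steepest descent iteration for the smallest eigenpair of a Hermitian matrix, with a Rayleigh-Ritz step over the two-dimensional space spanned by the current iterate and the preconditioned residual.

```python
import numpy as np

def psd_smallest_eig(A, prec, x0, tol=1e-10, max_iter=500):
    """Preconditioned steepest descent for the smallest eigenpair of a
    Hermitian matrix A.  `prec(r)` applies a preconditioner to the residual,
    e.g. an approximate inverse of A (or of a shifted A)."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(max_iter):
        Ax = A @ x
        rho = np.vdot(x, Ax).real                     # Rayleigh quotient
        r = Ax - rho * x                              # eigenvalue residual
        if np.linalg.norm(r) < tol:
            break
        w = prec(r)                                   # preconditioned search direction
        V, _ = np.linalg.qr(np.column_stack([x, w]))  # 2-dimensional trial subspace
        H = V.conj().T @ (A @ V)                      # Rayleigh-Ritz projection
        _, Y = np.linalg.eigh(H)
        x = V @ Y[:, 0]                               # Ritz vector for the smallest Ritz value
        x = x / np.linalg.norm(x)
    return np.vdot(x, A @ x).real, x

# usage: smallest eigenpair of a random symmetric matrix with a Jacobi preconditioner
rng = np.random.default_rng(2)
B = rng.standard_normal((200, 200))
A = (B + B.T) / 2 + 200 * np.eye(200)                 # symmetric, positive definite
prec = lambda r: r / np.diag(A)                       # simple diagonal (Jacobi) preconditioner
val, vec = psd_smallest_eig(A, prec, rng.standard_normal(200))
```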

     