

Title: Anderson acceleration for seismic inversion
State-of-the-art seismic imaging techniques treat inversion tasks such as full-waveform inversion (FWI) and least-squares reverse time migration (LSRTM) as partial differential equation-constrained optimization problems. Due to the large-scale nature of these problems, gradient-based optimization algorithms are preferred in practice to update the model iteratively. Higher-order methods converge in fewer iterations but often incur higher computational cost per iteration, more line-search steps, and larger memory requirements, so a balance among these aspects must be struck. We have evaluated Anderson acceleration (AA), a popular strategy for speeding up the convergence of fixed-point iterations, as a way to accelerate the steepest-descent algorithm, which we recast as a fixed-point iteration. Independent of the dimensionality of the unknown parameters, the computational cost of implementing the method reduces to an extremely low-dimensional least-squares problem, and the cost can be reduced further by a low-rank update. We establish the theoretical connections and differences between AA and other well-known optimization methods such as L-BFGS and the restarted generalized minimal residual method, and we compare their computational cost and memory requirements. Numerical examples of FWI and LSRTM applied to the Marmousi benchmark demonstrate the acceleration achieved by AA. Compared with the steepest-descent method, AA converges faster and provides results competitive with some quasi-Newton methods, making it an attractive optimization strategy for seismic inversion.
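To make the fixed-point view concrete, here is a minimal NumPy sketch of (Type-II) Anderson acceleration applied to a steepest-descent map g(x) = x - alpha * grad f(x). It is an illustration under stated assumptions, not the authors' implementation: the function name `anderson_accelerate`, the window size `m`, and the quadratic test problem are all made up for this example.

```python
import numpy as np

def anderson_accelerate(g, x0, m=5, max_iter=200, tol=1e-8):
    """Type-II Anderson acceleration of the fixed-point iteration x <- g(x).

    Only the last m residual differences are kept, so each step solves a
    least-squares problem with at most m columns -- a cost independent of
    the dimension of x.
    """
    x = np.asarray(x0, dtype=float)
    xs, gs = [x], [g(x)]                 # histories of iterates and map values
    for _ in range(max_iter):
        r = gs[-1] - xs[-1]              # fixed-point residual g(x_k) - x_k
        if np.linalg.norm(r) < tol:
            break
        k = len(xs) - 1                  # number of differences in the window
        if k == 0:
            x_new = gs[-1]               # plain fixed-point step to start
        else:
            # Columns: differences of residuals and of map values over the window
            dR = np.column_stack([(gs[i + 1] - xs[i + 1]) - (gs[i] - xs[i])
                                  for i in range(k)])
            dG = np.column_stack([gs[i + 1] - gs[i] for i in range(k)])
            gamma, *_ = np.linalg.lstsq(dR, r, rcond=None)  # tiny m-column solve
            x_new = gs[-1] - dG @ gamma  # Anderson-mixed update
        xs.append(x_new)
        gs.append(g(x_new))
        if len(xs) > m + 1:              # sliding memory window
            xs.pop(0); gs.pop(0)
    return xs[-1]

# Usage sketch: steepest descent on f(x) = 0.5*||A x - b||^2 recast as the
# fixed-point map g(x) = x - alpha * A^T (A x - b), then accelerated by AA.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20)); b = rng.standard_normal(50)
alpha = 1.0 / np.linalg.norm(A, 2) ** 2  # step size <= 1/L for this quadratic
g = lambda x: x - alpha * (A.T @ (A @ x - b))
x_aa = anderson_accelerate(g, np.zeros(20))
```

Because only the last m residual differences enter the least-squares solve, each accelerated step adds a solve with at most m columns, regardless of how many model parameters x holds.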
Award ID(s): 1913129
NSF-PAR ID: 10252867
Author(s) / Creator(s):
Date Published:
Journal Name: GEOPHYSICS
Volume: 86
Issue: 1
ISSN: 0016-8033
Page Range / eLocation ID: R99 to R108
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Full waveform inversion (FWI) and least-squares reverse time migration (LSRTM) are popular imaging techniques that can be solved as PDE-constrained optimization problems. Due to the large-scale nature of these problems, gradient- and Hessian-based optimization algorithms are preferred in practice to find the optimizer iteratively. However, a balance between the evaluation cost and the rate of convergence needs to be considered. We propose the use of Anderson acceleration (AA), a popular strategy to speed up the convergence of fixed-point iterations, to accelerate a gradient descent method. We show that AA can achieve fast convergence that provides competitive results with some quasi-Newton methods. Independent of the dimensionality of the unknown parameters, the computational cost of implementing the method can be reduced to an extremely low-dimensional least-squares problem, which makes AA an attractive method for seismic inversion.
  2. A computationally efficient one-shot approach with a low memory footprint is presented for unsteady optimization. The proposed technique is based on a novel approach that combines local-in-time and fixed-point iteration methods to advance the unconverged primal and adjoint solutions forward and backward in time, evaluating the sensitivity of the globally time-integrated objective function. This is in some ways similar to piggyback iterations, in which primal and adjoint solutions are evaluated at a frozen design. During each cycle, the primal, adjoint, and design update problems are solved to advance the optimization problem. This coupled approach is shown to provide significant savings in the memory footprint while reducing the computational cost of primal and adjoint evaluations per design cycle. The method is first applied to an inverse design problem for the unsteady lid-driven cavity. Following this, vortex suppression and mean drag reduction for a circular cylinder in crossflow are considered. Both objectives are achieved by optimizing the rotational speeds for steady or periodically oscillating excitations. For all cases presented in this work, the proposed technique provides significant reductions in memory as well as computational time, and the unsteady optimization problem converges to the same optimal solution obtained using a conventional approach.
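A hedged, scalar toy version of such a piggyback-style one-shot loop is sketched below. It is not the paper's local-in-time scheme; the function `one_shot_piggyback`, the toy fixed-point map, and all parameter values are assumptions made for illustration.

```python
def one_shot_piggyback(G, Gx, Gu, Jx, Ju, x0, u0, eta=0.1, cycles=300):
    """Piggyback-style one-shot loop for min_u J(x, u) subject to the
    fixed point x = G(x, u).

    Each cycle advances the UNCONVERGED primal and adjoint by one sweep
    and then updates the design, instead of fully converging the state
    equation before every gradient step.
    """
    x, u, lam = x0, u0, 0.0
    for _ in range(cycles):
        x = G(x, u)                        # one primal sweep (forward)
        lam = Jx(x, u) + Gx(x, u) * lam    # one adjoint sweep (backward)
        grad = Ju(x, u) + Gu(x, u) * lam   # sensitivity at the frozen design
        u -= eta * grad                    # design update
    return x, u

# Toy problem: x = 0.5*x + u has fixed point x* = 2u; with J = 0.5*(x - 1)^2
# the optimum is u* = 0.5 (so x* = 1).
G  = lambda x, u: 0.5 * x + u
Gx = lambda x, u: 0.5
Gu = lambda x, u: 1.0
Jx = lambda x, u: x - 1.0
Ju = lambda x, u: 0.0
x_opt, u_opt = one_shot_piggyback(G, Gx, Gu, Jx, Ju, x0=0.0, u0=0.0)
```

Each cycle costs one primal sweep and one adjoint sweep, which is where the memory and run-time savings over fully converging the state equation at every design step come from.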
  3. First-arrival traveltime tomography is an essential method for obtaining near-surface velocity models. Adjoint-state first-arrival traveltime tomography is appealing due to its straightforward implementation, low computational cost, and low memory consumption. Because solving the point-source isotropic eikonal equation by either ray tracers or eikonal solvers intrinsically corresponds to emanating discrete rays from the source point, the resulting traveltime gradient is singular at the source point; we call this singular pattern the imprint of ray-illumination. Because the adjoint-state equation propagates traveltime residuals back to the source point along the negative traveltime gradient, the resulting adjoint state inherits this imprint, leading to singular gradient-descent directions when updating the velocity model. To mitigate the imprint, we solve the adjoint-state equation twice with different boundary conditions: once with the regular data residuals and once with uniform ones. The latter adjoint state normalizes the former, and the normalized quantity serves as the gradient direction for updating the velocity model; we call this process ray-illumination compensation. To overcome the issue of limited aperture, we have developed a spatially varying regularization method to stabilize the new gradient direction. A synthetic example demonstrates that our method mitigates the imprint of ray-illumination, removes the footprint effect near source points, and provides uniform velocity updates along raypaths. A complex example extracted from the Marmousi2 model and a migration example illustrate that the new method accurately recovers the velocity model and that an offset-dependent inversion strategy can further improve the quality of the recovered models.
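The two-solve normalization reduces to a few lines. The sketch below assumes a hypothetical adjoint-state solver `solve_adjoint(bc)` that back-propagates boundary data `bc` toward the source; the solver, the function name, and the small `eps` guard are all assumptions, not code from the paper.

```python
import numpy as np

def compensated_direction(solve_adjoint, data_residuals, eps=1e-12):
    """Sketch of the two-solve ray-illumination compensation described above.

    `solve_adjoint(bc)` stands for a hypothetical adjoint-state solver that
    back-propagates the boundary data `bc` toward the source point.
    """
    lam_res = solve_adjoint(data_residuals)                # regular adjoint state
    lam_one = solve_adjoint(np.ones_like(data_residuals))  # illumination proxy
    return lam_res / (lam_one + eps)   # normalized gradient direction
```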
  4. Solving a bilevel optimization problem is at the core of several machine learning problems such as hyperparameter tuning, data denoising, meta- and few-shot learning, and training-data poisoning. Unlike simultaneous or multi-objective optimization, the steepest-descent direction for minimizing the upper-level cost in a bilevel problem requires the inverse of the Hessian of the lower-level cost. In this work, we propose a novel algorithm for solving bilevel optimization problems based on the classical penalty function approach. Our method avoids computing the Hessian inverse and can easily handle constrained bilevel problems. We prove the convergence of the method under mild conditions and show that the exact hypergradient is obtained asymptotically. The method's simplicity and low space and time complexity enable us to effectively solve large-scale bilevel problems involving deep neural networks. We present results on data denoising, few-shot learning, and training-data poisoning problems in a large-scale setting, and our results show that our approach outperforms or is comparable to previously proposed methods based on automatic differentiation and approximate inversion in terms of accuracy, run-time, and convergence speed.
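As a hedged illustration of the penalty idea (not the paper's algorithm), the scalar sketch below replaces the lower-level optimality condition with a quadratic penalty; `penalty_bilevel`, the toy objectives, and all parameter values are assumptions made for this example.

```python
def penalty_bilevel(Fx, Fu, fx, fxx, fxu, x0, u0,
                    sigma0=1.0, growth=1.5, outer=30, inner=200, eta=0.5):
    """Penalty-method sketch for  min_u F(x, u)  s.t.  x = argmin_x f(x, u).

    The lower-level optimality condition f_x(x, u) = 0 is enforced through a
    quadratic penalty (sigma/2) * f_x^2, so the gradient needs only
    Hessian-vector terms (fxx, fxu) -- never a Hessian inverse.
    """
    x, u, sigma = x0, u0, sigma0
    for _ in range(outer):
        step = eta / (1.0 + sigma)        # shrink the step as the penalty stiffens
        for _ in range(inner):
            r = fx(x, u)                  # lower-level optimality residual
            x -= step * (Fx(x, u) + sigma * fxx(x, u) * r)
            u -= step * (Fu(x, u) + sigma * fxu(x, u) * r)
        sigma *= growth                   # tighten the penalty
    return x, u

# Toy: f = 0.5*(x - u)^2 gives x*(u) = u; F = 0.5*(x - 1)^2 + 0.05*u^2, so the
# exact bilevel solution is u* = 1/1.1.
Fx  = lambda x, u: x - 1.0
Fu  = lambda x, u: 0.1 * u
fx  = lambda x, u: x - u
fxx = lambda x, u: 1.0
fxu = lambda x, u: -1.0
x_s, u_s = penalty_bilevel(Fx, Fu, fx, fxx, fxu, x0=0.0, u0=0.0)
```

The point the sketch mirrors is that the penalty gradient involves only Hessian-vector terms such as fxx * r, so nothing needs to be inverted.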