
Title: Exponential Convergence of Sobolev Gradient Descent for a Class of Nonlinear Eigenproblems
We propose to use the Łojasiewicz inequality as a general tool for analyzing the convergence rate of gradient descent on a Hilbert manifold, without resorting to the continuous gradient flow. Using this tool, we show that a Sobolev gradient descent method with adaptive inner product converges exponentially fast to the ground state for the Gross-Pitaevskii eigenproblem. This method can be extended to a class of general high-degree optimizations or nonlinear eigenproblems under certain conditions. We demonstrate this generalization using several examples, in particular a nonlinear Schrödinger eigenproblem with an extra high-order interaction term. Numerical experiments are presented for these problems.
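As a rough illustration of the kind of iteration the abstract describes, the sketch below takes Sobolev-type gradient steps with the adaptive metric operator A_u = -Δ + V + βu² on a 1-D finite-difference discretization of the Gross-Pitaevskii energy. The grid, the harmonic potential, the interaction strength β, and the specific update u ← normalize(A_u⁻¹ u) are illustrative assumptions, not necessarily the paper's exact scheme.

```python
# Minimal 1-D sketch of an adaptive-metric Sobolev gradient step for the
# Gross-Pitaevskii ground state; all problem data below are assumptions.
import numpy as np

n, L = 256, 16.0                      # grid points and domain length (assumed)
h = L / (n + 1)
x = np.linspace(-L / 2 + h, L / 2 - h, n)
V = 0.5 * x**2                        # harmonic trapping potential (assumed)
beta = 100.0                          # interaction strength (assumed)

# 1-D Laplacian with homogeneous Dirichlet boundary conditions
lap = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
       + np.diag(np.ones(n - 1), -1)) / h**2

def normalize(u):
    return u / (np.sqrt(h) * np.linalg.norm(u))   # enforce the L2 constraint ||u|| = 1

u = normalize(np.exp(-x**2))          # positive initial guess
for it in range(500):
    A_u = -lap + np.diag(V + beta * u**2)         # adaptive metric operator A_u
    u_new = normalize(np.linalg.solve(A_u, u))    # step: u <- A_u^{-1} u, renormalized
    if np.sqrt(h) * np.linalg.norm(u_new - u) < 1e-12:
        u = u_new
        break
    u = u_new

# Rayleigh-type eigenvalue estimate: lambda = <u, (-Lap + V + beta u^2) u>
lam = h * (u @ (-lap @ u) + np.sum((V + beta * u**2) * u**2))
print(f"iterations: {it + 1}, approximate eigenvalue: {lam:.6f}")
```

With a positive initial guess, the iterate should settle on an approximation of the ground state, and the printed quantity is a Rayleigh-type estimate of the corresponding eigenvalue.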
Award ID(s):
1912654 1907977
PAR ID:
10354170
Author(s) / Creator(s):
Date Published:
Journal Name:
Communications in mathematical sciences
Volume:
20
Issue:
2
ISSN:
1539-6746
Page Range / eLocation ID:
377–403
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    We study the relative power of learning with gradient descent on differentiable models, such as neural networks, versus using the corresponding tangent kernels. We show that under certain conditions, gradient descent achieves small error only if a related tangent kernel method achieves a non-trivial advantage over random guessing (a.k.a. weak learning), though this advantage might be very small even when gradient descent can achieve arbitrarily high accuracy. Complementing this, we show that without these conditions, gradient descent can in fact learn with small error even when no kernel method, in particular using the tangent kernel, can achieve a non-trivial advantage over random guessing. 
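To make the "tangent kernel method" referred to above concrete, here is a small, hypothetical sketch that forms the empirical tangent kernel K(x, x') = ⟨∇_θ f(x), ∇_θ f(x')⟩ of a one-hidden-layer ReLU network at initialization; the architecture, width, and inputs are assumptions made only for illustration.

```python
# Hypothetical sketch: empirical tangent kernel of a small ReLU network.
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 512                                   # input dimension and hidden width (assumed)
W = rng.normal(size=(m, d))                     # hidden-layer weights at initialization
a = rng.normal(size=m)                          # output-layer weights at initialization

def param_grad(x):
    """Gradient of f(x) = a . relu(W x) / sqrt(m) with respect to (W, a), flattened."""
    z = W @ x
    act = np.maximum(z, 0.0)
    d_act = (z > 0.0).astype(float)
    grad_a = act / np.sqrt(m)
    grad_W = (a * d_act)[:, None] * x[None, :] / np.sqrt(m)
    return np.concatenate([grad_W.ravel(), grad_a])

def tangent_kernel(X):
    """Gram matrix of the tangent kernel on the rows of X."""
    G = np.stack([param_grad(x) for x in X])
    return G @ G.T

X = rng.normal(size=(8, d))                     # a few sample inputs (assumed)
K = tangent_kernel(X)
print(K.shape, np.linalg.eigvalsh(K)[:3])       # kernel Gram matrix and its smallest eigenvalues
```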
  2. The discretized Bethe-Salpeter eigenvalue (BSE) problem arises in many-body physics and quantum chemistry. Discretization leads to an algebraic eigenvalue problem involving a matrix $H\in \mathbb{C}^{2n\times 2n}$ with a Hamiltonian-like structure. With proper transformations, the real BSE eigenproblem of form I and the complex BSE eigenproblem of form II can be transformed into real product eigenvalue problems of order $n$ and $2n$, respectively. We propose a new variant of the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) method, enhanced with polynomial filters, to improve the robustness and effectiveness of a few well-known algorithms for computing the lowest eigenvalues of the product eigenproblems. Furthermore, our proposed method can easily be employed to solve large sparse standard symmetric eigenvalue problems. We show that our ideal locally optimal algorithm delivers a Rayleigh quotient approximation to the desired lowest eigenvalue that satisfies a global quasi-optimality, similar to the global optimality of the preconditioned conjugate gradient method for the iterative solution of a symmetric positive definite linear system. The robustness and efficiency of our proposed method are illustrated by numerical experiments.
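The snippet below is a minimal sketch of the baseline LOBPCG iteration using SciPy's stock implementation on an assumed sparse symmetric test matrix; it does not include the polynomial filters or the product-eigenproblem transformations proposed in the work above.

```python
# Minimal sketch: lowest eigenvalues of a sparse symmetric matrix via stock LOBPCG.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, lobpcg, splu

n, k = 2000, 4                                        # matrix size and number of eigenpairs (assumed)
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")   # 1-D Laplacian test matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(n, k))                           # random initial block
lu = splu(A)                                          # sparse LU used here as a (strong) preconditioner
M = LinearOperator((n, n), matvec=lu.solve)

vals, vecs = lobpcg(A, X, M=M, largest=False, tol=1e-8, maxiter=200)
print("approximate lowest eigenvalues:", np.sort(vals))
```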
  3. Finding diverse and representative Pareto solutions from the Pareto front is a key challenge in multi-objective optimization (MOO). In this work, we propose a novel gradient-based algorithm for profiling the Pareto front by using Stein variational gradient descent (SVGD). We also provide a counterpart of our method based on Langevin dynamics. Our methods iteratively update a set of points in a parallel fashion to push them towards the Pareto front using multiple gradient descent, while encouraging diversity among the particles through the repulsive force mechanism in SVGD or the diffusion noise in Langevin dynamics. Compared with existing gradient-based methods that require predefined preference functions, our method can work efficiently in high-dimensional problems and can obtain more diverse solutions evenly distributed on the Pareto front. Moreover, our methods are theoretically guaranteed to converge to the Pareto front. We demonstrate the effectiveness of our method, especially the SVGD algorithm, through extensive experiments, showing its superiority over existing gradient-based algorithms.
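As a simplified stand-in for the SVGD-based profiling described above, the sketch below pushes a set of particles toward the Pareto front of two toy quadratic objectives using the closed-form two-objective MGDA common-descent direction, and spreads them with an RBF kernel repulsion term; the objectives, bandwidth, and step sizes are assumptions.

```python
# Illustrative sketch: multiple gradient descent + SVGD-style kernel repulsion.
import numpy as np

rng = np.random.default_rng(0)

def grads(X):
    """Gradients of two toy objectives f1 = ||x - a||^2 and f2 = ||x - b||^2."""
    a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    return 2.0 * (X - a), 2.0 * (X - b)

def mgda_direction(g1, g2):
    """Closed-form minimum-norm convex combination of two gradients (per particle)."""
    diff = g1 - g2
    denom = np.sum(diff * diff, axis=1, keepdims=True) + 1e-12
    lam = np.clip(np.sum((g2 - g1) * g2, axis=1, keepdims=True) / denom, 0.0, 1.0)
    return lam * g1 + (1.0 - lam) * g2

X = rng.normal(size=(50, 2))                      # 50 particles in 2-D (assumed)
step, bandwidth, repulsion = 0.05, 0.5, 0.1
for _ in range(500):
    g1, g2 = grads(X)
    d = mgda_direction(g1, g2)                    # common descent direction toward the front
    sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=-1)
    K = np.exp(-sq / (2.0 * bandwidth**2))        # RBF kernel between particles
    # kernel-gradient term pushes particles apart (SVGD-style repulsion)
    rep = (K[:, :, None] * (X[:, None, :] - X[None, :, :])).sum(axis=1) / bandwidth**2
    X = X - step * d + step * repulsion * rep / len(X)

print("spread of particles along the Pareto set:", X.min(0), X.max(0))
```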
  4. We investigate a primal-dual (PD) method for the saddle point problem (SPP) that uses a linear approximation of the primal function instead of the standard proximal step, resulting in a linearized PD (LPD) method. For convex-strongly concave SPPs, we observe that the LPD method has a suboptimal dependence on the Lipschitz constant of the primal function. To fix this issue, we combine features of Accelerated Gradient Descent with the LPD method, resulting in a single-loop Accelerated Linearized Primal-Dual (ALPD) method. The ALPD method achieves the optimal gradient complexity when the SPP has a semi-linear coupling function. We also present an inexact ALPD method for SPPs with a general nonlinear coupling function that maintains the optimal gradient evaluations of the primal parts and significantly improves the gradient evaluations of the coupling term compared to the ALPD method. We verify our findings with numerical experiments.
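The following sketch illustrates a generic linearized primal-dual iteration (a Condat-Vu-style scheme used here as a stand-in, not the ALPD method itself) on an assumed convex-strongly-concave toy problem: the smooth primal term is handled by a gradient step instead of a proximal step, while the strongly concave dual term admits a closed-form proximal update. All problem data and step sizes are assumptions for the example.

```python
# Sketch of a linearized primal-dual scheme for
#   min_x max_y 0.5||Ax - b||^2 + y^T (Kx - c) - (mu/2)||y||^2.
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 30, 20, 10
A, b = rng.normal(size=(m, n)), rng.normal(size=m)
K, c = rng.normal(size=(p, n)), rng.normal(size=p)
mu = 1.0                                            # strong concavity in y (assumed)

Lf = np.linalg.norm(A, 2) ** 2                      # Lipschitz constant of grad f
nK = np.linalg.norm(K, 2)
sigma = 1.0 / nK
tau = 1.0 / (Lf + nK)                               # satisfies 1/tau >= Lf/2 + sigma*||K||^2

x, y = np.zeros(n), np.zeros(p)
for _ in range(5000):
    x_new = x - tau * (A.T @ (A @ x - b) + K.T @ y)          # linearized (gradient) primal step
    x_bar = 2 * x_new - x                                    # primal extrapolation
    y = (y + sigma * (K @ x_bar - c)) / (1 + sigma * mu)     # proximal dual ascent step
    x = x_new

# closed-form saddle point for comparison: eliminate y = (Kx - c)/mu
x_star = np.linalg.solve(A.T @ A + K.T @ K / mu, A.T @ b + K.T @ c / mu)
print("distance to saddle point:", np.linalg.norm(x - x_star))
```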
  5.
    Finding Nash equilibria in two-player zero-sum continuous games is a central problem in machine learning, e.g. for training both GANs and robust models. The existence of pure Nash equilibria requires strong conditions which are not typically met in practice. Mixed Nash equilibria exist in greater generality and may be found using mirror descent. Yet this approach does not scale to high dimensions. To address this limitation, we parametrize mixed strategies as mixtures of particles, whose positions and weights are updated using gradient descent-ascent. We study this dynamics as an interacting gradient flow over measure spaces endowed with the Wasserstein-Fisher-Rao metric. We establish global convergence to an approximate equilibrium for the related Langevin gradient-ascent dynamic. We prove a law of large numbers that relates particle dynamics to mean-field dynamics. Our method identifies mixed equilibria in high dimensions and is demonstrably effective for training mixtures of GANs. 
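To illustrate the particle parametrization described above, the sketch below represents each player's mixed strategy as a weighted mixture of particles, moves the positions by gradient descent-ascent with a small Langevin noise term, and updates the weights multiplicatively (a Fisher-Rao-style step). The periodic payoff, step sizes, and noise level are assumptions, and this is a simplified variant rather than the paper's exact algorithm.

```python
# Simplified sketch: particle-based mixed strategies for a zero-sum continuous game.
import numpy as np

rng = np.random.default_rng(0)
N = 64                                              # particles per player (assumed)

def payoff(x, y):
    """Payoff matrix g(x_i, y_j) for the assumed periodic game."""
    return np.sin(2 * np.pi * (x[:, None] - y[None, :]))

def payoff_dx(x, y):
    """Derivative of the payoff with respect to the x-particles."""
    return 2 * np.pi * np.cos(2 * np.pi * (x[:, None] - y[None, :]))

x, y = rng.uniform(size=N), rng.uniform(size=N)     # particle positions (min / max player)
w, v = np.full(N, 1 / N), np.full(N, 1 / N)         # particle weights
eta, alpha, noise = 1e-3, 1e-2, 1e-3

for _ in range(2000):
    G = payoff(x, y)
    px = G @ v                                      # expected payoff of each x-particle
    py = G.T @ w                                    # expected payoff of each y-particle
    F = w @ G @ v                                   # value of the current mixed pair
    Gx = payoff_dx(x, y)
    # Wasserstein part: gradient descent (x) / ascent (y) on positions, plus noise
    x += -eta * (Gx @ v) + noise * rng.normal(size=N)
    y += -eta * (Gx.T @ w) + noise * rng.normal(size=N)
    # Fisher-Rao part: multiplicative weight updates, then renormalize
    w *= np.exp(-alpha * (px - F)); w /= w.sum()
    v *= np.exp(alpha * (py - F)); v /= v.sum()

print("approximate game value:", w @ payoff(x, y) @ v)
```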