

Title: EXPONENTIAL CONVERGENCE OF SOBOLEV GRADIENT DESCENT FOR A CLASS OF NONLINEAR EIGENPROBLEMS
Abstract: We propose to use the Łojasiewicz inequality as a general tool for analyzing the convergence rate of gradient descent on a Hilbert manifold, without resorting to the continuous gradient flow. Using this tool, we show that a Sobolev gradient descent method with adaptive inner product converges exponentially fast to the ground state for the Gross-Pitaevskii eigenproblem. This method can be extended to a class of general high-degree optimizations or nonlinear eigenproblems under certain conditions. We demonstrate this generalization using several examples, in particular a nonlinear Schrödinger eigenproblem with an extra high-order interaction term. Numerical experiments are presented for these problems.
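As a concrete illustration of the kind of scheme the abstract describes, here is a minimal 1D finite-difference sketch of Sobolev gradient descent on the L² sphere with the adaptive inner product a_u(v, w) = ∫(v'w' + V v w + β u² v w) for the Gross-Pitaevskii energy. The grid, potential, interaction strength β, and step size are illustrative assumptions; this follows the general adaptive-metric idea rather than reproducing the paper's exact algorithm or convergence analysis.

```python
import numpy as np

# Gross-Pitaevskii energy on [-L, L] with zero Dirichlet boundary:
#   E(u) = 1/2 <u, -u'' + V u> + (beta/4) ||u||_4^4,  constraint ||u||_{L2} = 1
n, L = 256, 8.0
x = np.linspace(-L, L, n)
h = x[1] - x[0]
D2 = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
      + np.diag(np.ones(n - 1), -1)) / h**2   # discrete Laplacian
V = 0.5 * x**2                                # harmonic trap (illustrative)
beta = 10.0                                   # interaction strength (illustrative)

def normalize(u):
    return u / np.sqrt(h * np.sum(u**2))

u = normalize(np.exp(-x**2))                  # initial guess
tau = 1.0                                     # step size in (0, 1]

for k in range(100):
    # adaptive metric a_u(v, w) = <v', w'> + <V v, w> + beta <u^2 v, w>
    A = -D2 + np.diag(V + beta * u**2)
    # Riesz representative of the L2 pairing <., u> in the metric a_u;
    # since A_u u = E'(u) for this energy, the Riemannian gradient of E on
    # the L2 sphere w.r.t. a_u simplifies to u - w / <w, u>_{L2}
    w = np.linalg.solve(A, u)
    g = u - w / (h * np.sum(w * u))
    u = normalize(u - tau * g)                # descend, then retract to the sphere

lam = h * np.sum(u * (-D2 @ u + V * u + beta * u**3))   # ground-state eigenvalue
print(f"approximate ground-state eigenvalue: {lam:.6f}")
```

With tau = 1 the update reduces to normalizing A_u⁻¹u, an inverse-iteration-like step; smaller tau gives a damped descent on the sphere.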
Award ID(s):
1912654 1907977
NSF-PAR ID:
10354170
Author(s) / Creator(s):
Date Published:
Journal Name:
Communications in mathematical sciences
Volume:
20
Issue:
2
ISSN:
1539-6746
Page Range / eLocation ID:
377–403
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Purpose

    To develop an improved k‐space reconstruction method using scan‐specific deep learning that is trained on autocalibration signal (ACS) data.

    Theory

    Robust artificial-neural-networks for k-space interpolation (RAKI) reconstruction trains convolutional neural networks on ACS data. This enables nonlinear estimation of missing k-space lines from acquired k-space data, with improved noise resilience compared to conventional methods such as GRAPPA, which interpolate k-space with linear convolutional kernels.

    Methods

    The training algorithm minimizes a mean square error loss over the target points in the ACS region using gradient descent. The neural network contains 3 layers of convolutional operators, with 2 of these including nonlinear activation functions. The noise performance and reconstruction quality of the RAKI method were compared with GRAPPA in phantom imaging, as well as in neurological and cardiac in vivo data sets.

    Results

    Phantom imaging shows that the proposed RAKI method outperforms GRAPPA at high (≥4) acceleration rates, both visually and quantitatively. Quantitative cardiac imaging shows improved noise resilience over GRAPPA at high acceleration rates (23% at rate 4 and 48% at rate 5). The same trend of improved noise resilience is also observed in high-resolution brain imaging at high acceleration rates.

    Conclusion

    The RAKI method offers a training database‐free deep learning approach for MRI reconstruction, with the potential to improve many existing reconstruction approaches, and is compatible with conventional data acquisition protocols.

     
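A minimal, hypothetical PyTorch sketch of the scan-specific training that abstract describes: a three-layer convolutional network, the first two layers followed by nonlinear (ReLU) activations, fit by gradient descent on an MSE loss over ACS target points. Layer widths, kernel sizes, coil count, and the random stand-in tensors are illustrative assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

# RAKI-style scan-specific network: three convolutional layers, the first
# two followed by ReLU nonlinearities. Complex multicoil k-space is handled
# by stacking real and imaginary parts as channels.
class RakiNet(nn.Module):
    def __init__(self, coils=8, rate=4):
        super().__init__()
        in_ch = 2 * coils                    # real + imag per coil
        out_ch = 2 * coils * (rate - 1)      # the rate-1 skipped lines per coil
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, kernel_size=(5, 2), dilation=(1, rate)),
            nn.ReLU(),
            nn.Conv2d(32, 8, kernel_size=(1, 1)),
            nn.ReLU(),
            nn.Conv2d(8, out_ch, kernel_size=(3, 2), dilation=(1, rate)),
        )

    def forward(self, k):                    # k: (batch, in_ch, ky, kx)
        return self.net(k)

# Training uses only the fully sampled ACS block of the scan itself:
# the network learns to predict the lines the acceleration pattern skips.
model = RakiNet()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)   # plain gradient descent
acs_input = torch.randn(1, 16, 48, 48)    # stand-in for ACS input patches
acs_target = torch.randn(1, 48, 42, 40)   # stand-in for the target points
for step in range(1000):
    opt.zero_grad()
    loss = ((model(acs_input) - acs_target) ** 2).mean()  # MSE over ACS targets
    loss.backward()
    opt.step()
```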
  2. Multi-task IRL recognizes that expert(s) could be switching between multiple ways of solving the same problem, or interleaving demonstrations of multiple tasks. The learner aims to learn the reward functions that individually guide these distinct ways. We present a new method for multi-task IRL that generalizes the well-known maximum entropy approach by combining it with a Dirichlet process based minimum entropy clustering of the observed data. This yields a single nonlinear optimization problem, called MinMaxEnt Multi-task IRL (MME-MTIRL), which can be solved using Lagrangian relaxation and gradient descent methods. We evaluate MME-MTIRL on the robotic task of sorting onions on a processing line, where the expert utilizes multiple ways of detecting and removing blemished onions. The method learns the underlying reward functions to a high level of accuracy and improves on previous approaches.
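Since MME-MTIRL builds on the maximum entropy approach, a minimal single-task MaxEnt IRL core on a toy chain MDP may help fix ideas. The Dirichlet-process clustering and the multi-task Lagrangian coupling from the abstract are omitted; the MDP, features, and expert visitation counts below are hypothetical.

```python
import numpy as np

# Toy 5-state chain MDP: action 0 steps left, action 1 steps right.
n_states, n_actions, gamma, horizon = 5, 2, 0.95, 30
P = np.zeros((n_states, n_actions, n_states))
for s in range(n_states):
    P[s, 0, max(s - 1, 0)] = 1.0
    P[s, 1, min(s + 1, n_states - 1)] = 1.0

phi = np.eye(n_states)                      # one indicator feature per state
expert_svf = np.array([0.05, 0.05, 0.1, 0.2, 0.6])  # hypothetical expert visits

def soft_policy(r):
    # soft value iteration -> stochastic MaxEnt policy pi(a|s)
    V = np.zeros(n_states)
    for _ in range(200):
        Q = r[:, None] + gamma * (P @ V)    # (n_states, n_actions)
        m = Q.max(axis=1, keepdims=True)
        V = (m + np.log(np.exp(Q - m).sum(axis=1, keepdims=True))).ravel()
    return np.exp(Q - V[:, None])

def expected_svf(pi):
    # state-visitation frequency of the learner's soft policy
    d = np.full(n_states, 1.0 / n_states)
    svf = d.copy()
    for _ in range(horizon):
        d = np.einsum('s,sa,sat->t', d, pi, P)
        svf += d
    return svf / svf.sum()

theta = np.zeros(n_states)                  # reward weights, r = phi @ theta
for it in range(300):
    pi = soft_policy(phi @ theta)
    # MaxEnt gradient: expert feature expectations minus the learner's
    theta += 0.5 * (expert_svf - expected_svf(pi))  # phi is the identity here
print("learned reward weights:", np.round(theta, 2))
```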
  3. A major challenge in many machine learning tasks is that model expressive power depends on model size. Low-rank tensor methods are an efficient tool for handling the curse of dimensionality in many large-scale machine learning models. The major challenges in training a tensor learning model include how to process high-volume data, how to determine the tensor rank automatically, and how to estimate the uncertainty of the results. While existing tensor learning methods focus on a specific task, this paper proposes a generic Bayesian framework that can be employed to solve a broad class of tensor learning problems, such as tensor completion, tensor regression, and tensorized neural networks. We develop a low-rank tensor prior for automatic rank determination in nonlinear problems. Our method is implemented with both stochastic gradient Hamiltonian Monte Carlo (SGHMC) and Stein Variational Gradient Descent (SVGD), and we compare the automatic rank determination and uncertainty quantification of these two solvers. We demonstrate that the proposed method can determine the tensor rank automatically and quantify the uncertainty of the obtained results. We validate our framework on tensor completion tasks and tensorized neural network training tasks.
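Of the two solvers the abstract names, SVGD is the more compact to sketch. Below is a minimal SVGD loop on a toy 2D Gaussian target; in the paper's framework the target would instead be the posterior over low-rank tensor factors, and the SGHMC alternative is not shown. The target, particle count, and step size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([2.0, -1.0])

def grad_log_p(x):
    return -(x - mu)                       # score of the toy target N(mu, I)

def svgd_step(x, step=0.1):
    n = x.shape[0]
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    bw = np.median(d2) / np.log(n + 1.0) + 1e-8   # median-heuristic bandwidth
    K = np.exp(-d2 / bw)                   # RBF kernel matrix
    # attractive term: kernel-smoothed scores; repulsive term: sum_j grad_{x_j} K
    attract = K @ grad_log_p(x)
    repulse = 2.0 / bw * (x * K.sum(axis=1, keepdims=True) - K @ x)
    return x + step * (attract + repulse) / n

x = rng.normal(size=(100, 2))              # initial particles
for _ in range(500):
    x = svgd_step(x)
print("particle mean:", x.mean(axis=0))    # should approach mu = [2, -1]
```

The repulsive term is what distinguishes SVGD from plain gradient ascent on log p: it spreads the particles so they quantify uncertainty rather than collapsing to the mode.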
  4. This paper is concerned with the inverse scattering problem of determining the spatially distributed dielectric constant coefficient of the 2D Helmholtz equation from multifrequency backscatter data associated with a single direction of the incident plane wave. We propose a globally convergent convexification numerical algorithm to solve this nonlinear and ill-posed inverse problem. The key advantage of our method over conventional optimization approaches is that it does not require a good first guess about the solution. First, we eliminate the coefficient from the Helmholtz equation using a change of variables. Next, using a truncated expansion with respect to a special Fourier basis, we approximately reformulate the inverse problem as a system of quasilinear elliptic PDEs, which can be numerically solved by a weighted quasi-reversibility approach. The cost functional for the weighted quasi-reversibility method is constructed as a Tikhonov-like functional that involves a Carleman Weight Function. Our numerical study shows that, using a version of the gradient descent method, one can find the minimizer of this Tikhonov-like functional without any advanced a priori knowledge about it.
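The core structural ingredient above is a Carleman-weighted residual plus a Tikhonov term, minimized by gradient descent. The sketch below exercises that structure on a hypothetical 1D stand-in (recovering u on [0,1] from u'' = f with Cauchy data at x = 0), not on the paper's 2D coefficient inverse problem; the weight parameter, regularization, and grid are illustrative assumptions.

```python
import numpy as np

# J(u) = sum_i w_i |(u'' - f)_i|^2 + rho ||u||^2,  Carleman weight w(x) = exp(-2*lam*x)
n = 101
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
u_true = np.sin(2 * np.pi * x)                      # ground truth for the demo
f = -(2 * np.pi) ** 2 * np.sin(2 * np.pi * x)       # f = u_true''

def D2(u):                                          # interior second difference
    return (u[:-2] - 2 * u[1:-1] + u[2:]) / h**2

lam, rho = 2.0, 1e-8                                # illustrative parameters
w = np.exp(-2 * lam * x[1:-1])                      # Carleman weight, interior nodes

def grad_J(u):
    res = w * (D2(u) - f[1:-1])
    g = rho * u
    g[:-2] += res / h**2                            # adjoint of the stencil
    g[1:-1] += -2.0 * res / h**2
    g[2:] += res / h**2
    g[0] = g[1] = 0.0                               # Cauchy-data nodes stay fixed
    return g

def hess_apply(v):                                  # J is quadratic: H v
    res = w * D2(v)
    g = rho * v
    g[:-2] += res / h**2
    g[1:-1] += -2.0 * res / h**2
    g[2:] += res / h**2
    g[0] = g[1] = 0.0
    return g

def J(u):
    res = D2(u) - f[1:-1]
    return 0.5 * np.sum(w * res**2) + 0.5 * rho * np.sum(u**2)

u = np.zeros(n)
u[0], u[1] = u_true[0], u_true[0] + h * 2 * np.pi   # encode u(0) and u'(0)

print(f"J before: {J(u):.3e}")
for it in range(5000):
    g = grad_J(u)
    alpha = (g @ g) / (g @ hess_apply(g) + 1e-300)  # exact line search
    u -= alpha * g
print(f"J after:  {J(u):.3e}")                      # decreases monotonically
```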
  5. Wasserstein gradient flows provide a powerful means of understanding and solving many diffusion equations. Specifically, Fokker-Planck equations, which model the diffusion of probability measures, can be understood as gradient descent over entropy functionals in Wasserstein space. This equivalence, introduced by Jordan, Kinderlehrer, and Otto, inspired the so-called JKO scheme to approximate these diffusion processes via an implicit discretization of the gradient flow in Wasserstein space. Solving the optimization problem associated with each JKO step, however, presents serious computational challenges. We introduce a scalable method to approximate Wasserstein gradient flows, targeted at machine learning applications. Our approach relies on input-convex neural networks (ICNNs) to discretize the JKO steps, which can be optimized by stochastic gradient descent. Contrary to previous work, our method does not require domain discretization or particle simulation. As a result, we can sample from the measure at each time step of the diffusion and compute its probability density. We demonstrate the performance of our algorithm by computing diffusions following the Fokker-Planck equation and apply it to unnormalized density sampling as well as nonlinear filtering.
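To make the JKO idea concrete, here is a toy scheme for the 1D Fokker-Planck equation with potential V(x) = x²/2, restricted to Gaussian measures N(m, s²) so that every term is in closed form. This is a hypothetical illustration of the implicit discretization only; the paper parameterizes each step with input-convex neural networks rather than restricting to a parametric family.

```python
# Free energy of rho = N(m, s^2) under V(x) = x^2/2:
#   F(m, s) = E[V] + E[log rho] = (m^2 + s^2)/2 - 0.5*log(2*pi*e*s^2)
# Squared W2 distance between 1D Gaussians: W2^2 = (m - m0)^2 + (s - s0)^2

def jko_step(m0, s0, tau, inner_iters=500, lr=0.01):
    # one implicit JKO step: min_{m,s}  W2^2 / (2*tau) + F(m, s),
    # solved here by plain gradient descent on the two parameters
    m, s = m0, s0
    for _ in range(inner_iters):
        gm = (m - m0) / tau + m              # d/dm of the JKO objective
        gs = (s - s0) / tau + s - 1.0 / s    # d/ds (the -1/s comes from entropy)
        m -= lr * gm
        s -= lr * gs
    return m, s

m, s, tau = 3.0, 0.3, 0.1                    # initial measure N(3, 0.3^2)
for k in range(50):
    m, s = jko_step(m, s, tau)
print(f"after 50 JKO steps: mean = {m:.3f}, std = {s:.3f}")
```

The iterates approach N(0, 1), the stationary density exp(-V) of this Fokker-Planck equation, illustrating how chaining implicit JKO steps tracks the diffusion toward equilibrium.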