skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, July 12 until 2:00 AM ET on Saturday, July 13 due to maintenance. We apologize for the inconvenience.

Title: Quantum algorithms for escaping from saddle points
We initiate the study of quantum algorithms for escaping from saddle points with provable guarantee. Given a function f : R n → R , our quantum algorithm outputs an ϵ -approximate second-order stationary point using O ~ ( log 2 ⁡ ( n ) / ϵ 1.75 ) queries to the quantum evaluation oracle (i.e., the zeroth-order oracle). Compared to the classical state-of-the-art algorithm by Jin et al. with O ~ ( log 6 ⁡ ( n ) / ϵ 1.75 ) queries to the gradient oracle (i.e., the first-order oracle), our quantum algorithm is polynomially better in terms of log ⁡ n and matches its complexity in terms of 1 / ϵ . Technically, our main contribution is the idea of replacing the classical perturbations in gradient descent methods by simulating quantum wave equations, which constitutes the improvement in the quantum query complexity with log ⁡ n factors for escaping from saddle points. We also show how to use a quantum gradient computation algorithm due to Jordan to replace the classical gradient queries by quantum evaluation queries with the same complexity. Finally, we also perform numerical experiments that support our theoretical findings.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Czumaj, Artur ; Dawar, Anuj ; Merelli, Emanuela (Ed.)
    Consider a set P ⊆ ℝ^d of n points, and a convex body C provided via a separation oracle. The task at hand is to decide for each point of P if it is in C using the fewest number of oracle queries. We show that one can solve this problem in two and three dimensions using O(⬡_P log n) queries, where ⬡_P is the largest subset of points of P in convex position. In 2D, we provide an algorithm which efficiently generates these adaptive queries. Furthermore, we show that in two dimensions one can solve this problem using O(⊚(P,C) log² n) oracle queries, where ⊚(P,C) is a lower bound on the minimum number of queries that any algorithm for this specific instance requires. Finally, we consider other variations on the problem, such as using the fewest number of queries to decide if C contains all points of P. As an application of the above, we show that the discrete geometric median of a point set P in ℝ² can be computed in O(n log² n (log n log log n + ⬡(P))) expected time. 
    more » « less
  2. Quantum computers can produce a quantum encoding of the solution of a system of differential equations exponentially faster than a classical algorithm can produce an explicit description. However, while high-precision quantum algorithms for linear ordinary differential equations are well established, the best previous quantum algorithms for linear partial differential equations (PDEs) have complexity p o l y ( 1 / ϵ ) , where ϵ is the error tolerance. By developing quantum algorithms based on adaptive-order finite difference methods and spectral methods, we improve the complexity of quantum algorithms for linear PDEs to be p o l y ( d , log ⁡ ( 1 / ϵ ) ) , where d is the spatial dimension. Our algorithms apply high-precision quantum linear system algorithms to systems whose condition numbers and approximation errors we bound. We develop a finite difference algorithm for the Poisson equation and a spectral algorithm for more general second-order elliptic equations. 
    more » « less
  3. We give two new quantum algorithms for solving semidefinite programs (SDPs) providing quantum speed-ups. We consider SDP instances with m constraint matrices, each of dimension n, rank at most r, and sparsity s. The first algorithm assumes an input model where one is given access to an oracle to the entries of the matrices at unit cost. We show that it has run time O~(s^2 (sqrt{m} epsilon^{-10} + sqrt{n} epsilon^{-12})), with epsilon the error of the solution. This gives an optimal dependence in terms of m, n and quadratic improvement over previous quantum algorithms (when m ~~ n). The second algorithm assumes a fully quantum input model in which the input matrices are given as quantum states. We show that its run time is O~(sqrt{m}+poly(r))*poly(log m,log n,B,epsilon^{-1}), with B an upper bound on the trace-norm of all input matrices. In particular the complexity depends only polylogarithmically in n and polynomially in r. We apply the second SDP solver to learn a good description of a quantum state with respect to a set of measurements: Given m measurements and a supply of copies of an unknown state rho with rank at most r, we show we can find in time sqrt{m}*poly(log m,log n,r,epsilon^{-1}) a description of the state as a quantum circuit preparing a density matrix which has the same expectation values as rho on the m measurements, up to error epsilon. The density matrix obtained is an approximation to the maximum entropy state consistent with the measurement data considered in Jaynes' principle from statistical mechanics. As in previous work, we obtain our algorithm by "quantizing" classical SDP solvers based on the matrix multiplicative weight update method. One of our main technical contributions is a quantum Gibbs state sampler for low-rank Hamiltonians, given quantum states encoding these Hamiltonians, with a poly-logarithmic dependence on its dimension, which is based on ideas developed in quantum principal component analysis. We also develop a "fast" quantum OR lemma with a quadratic improvement in gate complexity over the construction of Harrow et al. [Harrow et al., 2017]. We believe both techniques might be of independent interest. 
    more » « less
  4. Riemannian optimization has drawn a lot of attention due to its wide applications in practice. Riemannian stochastic first-order algorithms have been studied in the literature to solve large-scale machine learning problems over Riemannian manifolds. However, most of the existing Riemannian stochastic algorithms require the objective function to be differentiable, and they do not apply to the case where the objective function is nonsmooth. In this paper, we present two Riemannian stochastic proximal gradient methods for minimizing nonsmooth function over the Stiefel manifold. The two methods, named R-ProxSGD and R-ProxSPB, are generalizations of proximal SGD and proximal SpiderBoost in Euclidean setting to the Riemannian setting. Analysis on the incremental first-order oracle (IFO) complexity of the proposed algorithms is provided. Specifically, the R-ProxSPB algorithm finds an ϵ -stationary point with O(ϵ−3) IFOs in the online case, and O(n+n‾√ϵ−2) IFOs in the finite-sum case with n being the number of summands in the objective. Experimental results on online sparse PCA and robust low-rank matrix completion show that our proposed methods significantly outperform the existing methods that use Riemannian subgradient information. 
    more » « less
  5. Motivated by estimation of quantum noise models, we study the problem of learning a Pauli channel, or more generally the Pauli error rates of an arbitrary channel. By employing a novel reduction to the "Population Recovery" problem, we give an extremely simple algorithm that learns the Pauli error rates of an n -qubit channel to precision ϵ in ℓ ∞ using just O ( 1 / ϵ 2 ) log ⁡ ( n / ϵ ) applications of the channel. This is optimal up to the logarithmic factors. Our algorithm uses only unentangled state preparation and measurements, and the post-measurement classical runtime is just an O ( 1 / ϵ ) factor larger than the measurement data size. It is also impervious to a limited model of measurement noise where heralded measurement failures occur independently with probability ≤ 1 / 4 .We then consider the case where the noise channel is close to the identity, meaning that the no-error outcome occurs with probability 1 − η . In the regime of small η we extend our algorithm to achieve multiplicative precision 1 ± ϵ (i.e., additive precision ϵ η ) using just O ( 1 ϵ 2 η ) log ⁡ ( n / ϵ ) applications of the channel. 
    more » « less