We show that any memory-constrained, first-order algorithm which minimizes d-dimensional, 1-Lipschitz convex functions over the unit ball to 1/ poly(d) accuracy using at most d^(1.25-delta) bits of memory must make at least d^(1+ 4 delta / 3) first-order queries (for any constant delta in (0,1/4). Consequently, the performance of such memory-constrained algorithms are a polynomial factor worse than the optimal O(d polylog d) query bound for this problem obtained by cutting plane methods that use >d^2 memory. This resolves one of the open problems in the COLT 2019 open problem publication of Woodworth and Srebro.
more »
« less
Efficient Convex Optimization Requires Superlinear Memory
We show that any memory-constrained, first-order algorithm which minimizes d-dimensional, 1-Lipschitz convex functions over the unit ball to 1/poly(d) accuracy using at most $$d^{1.25-\delta}$$ bits of memory must make at least $$\tilde{Omega}(d^{1+(4/3)\delta})$$ first-order queries (for any constant $$\delta in [0,1/4]$$). Consequently, the performance of such memory-constrained algorithms are a polynomial factor worse than the optimal $$\tilde{O}(d)$$ query bound for this problem obtained by cutting plane methods that use $$\tilde{O}(d^2)$$ memory. This resolves a COLT 2019 open problem of Woodworth and Srebro.
more »
« less
- PAR ID:
- 10564954
- Publisher / Repository:
- Journal of the ACM
- Date Published:
- Journal Name:
- Journal of the ACM
- Volume:
- 71
- Issue:
- 6
- ISSN:
- 0004-5411
- Page Range / eLocation ID:
- 1 to 37
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)Abstract We propose a heuristic algorithm to solve the underlying hard problem of the CSIDH cryptosystem (and other isogeny-based cryptosystems using elliptic curves with endomorphism ring isomorphic to an imaginary quadratic order 𝒪). Let Δ = Disc(𝒪) (in CSIDH, Δ = −4 p for p the security parameter). Let 0 < α < 1/2, our algorithm requires: A classical circuit of size 2 O ˜ log ( | Δ | ) 1 − α . $$2^{\tilde{O}\left(\log(|\Delta|)^{1-\alpha}\right)}.$$ A quantum circuit of size 2 O ˜ log ( | Δ | ) α . $$2^{\tilde{O}\left(\log(|\Delta|)^{\alpha}\right)}.$$ Polynomial classical and quantum memory. Essentially, we propose to reduce the size of the quantum circuit below the state-of-the-art complexity 2 O ˜ log ( | Δ | ) 1 / 2 $$2^{\tilde{O}\left(\log(|\Delta|)^{1/2}\right)}$$ at the cost of increasing the classical circuit-size required. The required classical circuit remains subexponential, which is a superpolynomial improvement over the classical state-of-the-art exponential solutions to these problems. Our method requires polynomial memory, both classical and quantum.more » « less
-
We consider the problem of performing linear regression over a stream of d-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of convergence than can be achieved without memory constraints. Specifically, consider a sequence of labeled examples (a_1,b_1), (a_2,b_2)..., with a_i drawn independently from a d-dimensional isotropic Gaussian, and where b_i = + \eta_i, for a fixed x in R^d with ||x||= 1 and with independent noise \eta_i drawn uniformly from the interval [-2^{-d/5},2^{-d/5}]. We show that any algorithm with at most d^2/4 bits of memory requires at least \Omega(d \log \log \frac{1}{\epsilon}) samples to approximate x to \ell_2 error \epsilon with probability of success at least 2/3, for \epsilon sufficiently small as a function of d. In contrast, for such \epsilon, x can be recovered to error \epsilon with probability 1-o(1) with memory O\left(d^2 \log(1/\epsilon)\right) using d examples. This represents the first nontrivial lower bounds for regression with super-linear memory, and may open the door for strong memory/sample tradeoffs for continuous optimization.more » « less
-
null (Ed.)We present an $$\tilde O(m+n^{1.5})$$-time randomized algorithm for maximum cardinality bipartite matching and related problems (e.g. transshipment, negative-weight shortest paths, and optimal transport) on $$m$$-edge, $$n$$-node graphs. For maximum cardinality bipartite matching on moderately dense graphs, i.e. $$m = \Omega(n^{1.5})$$, our algorithm runs in time nearly linear in the input size and constitutes the first improvement over the classic $$O(m\sqrt{n})$$-time [Dinic 1970; Hopcroft-Karp 1971; Karzanov 1973] and $$\tilde O(n^\omega)$$-time algorithms [Ibarra-Moran 1981] (where currently $$\omega\approx 2.373$$). On sparser graphs, i.e. when $$m = n^{9/8 + \delta}$$ for any constant $$\delta>0$$, our result improves upon the recent advances of [Madry 2013] and [Liu-Sidford 2020b, 2020a] which achieve an $$\tilde O(m^{4/3+o(1)})$$ runtime. We obtain these results by combining and advancing recent lines of research in interior point methods (IPMs) and dynamic graph algorithms. First, we simplify and improve the IPM of [v.d.Brand-Lee-Sidford-Song 2020], providing a general primal-dual IPM framework and new sampling-based techniques for handling infeasibility induced by approximate linear system solvers. Second, we provide a simple sublinear-time algorithm for detecting and sampling high-energy edges in electric flows on expanders and show that when combined with recent advances in dynamic expander decompositions, this yields efficient data structures for maintaining the iterates of both [v.d.Brand~et~al.] and our new IPMs. Combining this general machinery yields a simpler $$\tilde O(n \sqrt{m})$$ time algorithm for matching based on the logarithmic barrier function, and our state-of-the-art $$\tilde O(m+n^{1.5})$$ time algorithm for matching based on the [Lee-Sidford 2014] barrier (as regularized in [v.d.Brand~et~al.]).more » « less
-
This paper focuses on showing time-message trade-offs in distributed algorithms for fundamental problems such as leader election, broadcast, spanning tree (ST), minimum spanning tree (MST), minimum cut, and many graph verification problems. We consider the synchronous CONGEST distributed computing model and assume that each node has initial knowledge of itself and the identifiers of its neighbors - the so-called KT_1 model - a well-studied model that also naturally arises in many applications. Recently, it has been established that one can obtain (almost) singularly optimal algorithms, i.e., algorithms that have simultaneously optimal time and message complexity (up to polylogarithmic factors), for many fundamental problems in the standard KT_0 model (where nodes have only local knowledge of themselves and not their neighbors). The situation is less clear in the KT_1 model. In this paper, we present several new distributed algorithms in the KT_1 model that trade off between time and message complexity. Our distributed algorithms are based on a uniform and general approach which involves constructing a sparsified spanning subgraph of the original graph - called a danner - that trades off the number of edges with the diameter of the sparsifier. In particular, a key ingredient of our approach is a distributed randomized algorithm that, given a graph G and any delta in [0,1], with high probability constructs a danner that has diameter O~(D + n^{1-delta}) and O~(min{m,n^{1+delta}}) edges in O~(n^{1-delta}) rounds while using O~(min{m,n^{1+delta}}) messages, where n, m, and D are the number of nodes, edges, and the diameter of G, respectively. Using our danner construction, we present a family of distributed randomized algorithms for various fundamental problems that exhibit a trade-off between message and time complexity and that improve over previous results. Specifically, we show the following results (all hold with high probability) in the KT_1 model, which subsume and improve over prior bounds in the KT_1 model (King et al., PODC 2014 and Awerbuch et al., JACM 1990) and the KT_0 model (Kutten et al., JACM 2015, Pandurangan et al., STOC 2017 and Elkin, PODC 2017): 1) Leader Election, Broadcast, and ST. These problems can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,1]. 2) MST and Connectivity. These problems can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,0.5]. In particular, for delta = 0.5 we obtain a distributed MST algorithm that runs in optimal O~(D+sqrt{n}) rounds and uses O~(min{m,n^{3/2}}) messages. We note that this improves over the singularly optimal algorithm in the KT_0 model that uses O~(D+sqrt{n}) rounds and O~(m) messages. 3) Minimum Cut. O(log n)-approximate minimum cut can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,0.5]. 4) Graph Verification Problems such as Bipartiteness, Spanning Subgraph etc. These can be solved in O~(D+n^{1-delta}) rounds using O~(min{m,n^{1+delta}}) messages for any delta in [0,0.5].more » « less
An official website of the United States government

