This content will become publicly available on December 1, 2025
Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include huge computational expenses, power, and time used both in the pre-training and fine-tuning process. In this work, we show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as
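The abstract above concerns quantum speedups for (stochastic) gradient descent. As a point of reference only, here is a minimal classical mini-batch SGD loop in Python; the least-squares model, learning rate, batch size, and iteration count are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

# Minimal classical mini-batch SGD on a least-squares problem.
# This is only a baseline sketch of the "(stochastic) gradient descent"
# the abstract refers to; the model and hyperparameters are illustrative
# assumptions, not the paper's setting.
rng = np.random.default_rng(0)
n_samples, dim = 1_000, 20
X = rng.standard_normal((n_samples, dim))
w_true = rng.standard_normal(dim)
y = X @ w_true + 0.1 * rng.standard_normal(n_samples)

w = np.zeros(dim)                 # parameters to learn
lr, batch, T = 0.05, 32, 500      # learning rate, batch size, iterations

for t in range(T):
    idx = rng.choice(n_samples, size=batch, replace=False)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch   # mini-batch least-squares gradient
    w -= lr * grad

print("parameter error:", np.linalg.norm(w - w_true))
```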
- PAR ID: 10506504
- Publisher / Repository: Nature Portfolio
- Date Published:
- Journal Name: Nature Communications
- Volume: 15
- Issue: 1
- ISSN: 2041-1723
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract: In this paper, we study multistage stochastic mixed-integer nonlinear programs (MS-MINLP). This general class of problems encompasses, as important special cases, multistage stochastic convex optimization with non-Lipschitzian value functions and multistage stochastic mixed-integer linear optimization. We develop stochastic dual dynamic programming (SDDP) type algorithms with nested decomposition, deterministic sampling, and stochastic sampling. The key ingredient is a new type of cuts based on generalized conjugacy. Several interesting classes of MS-MINLP are identified, where the new algorithms are guaranteed to obtain the global optimum without the assumption of complete recourse. This significantly generalizes the classic SDDP algorithms. We also characterize the iteration complexity of the proposed algorithms. In particular, for a $(T+1)$-stage stochastic MINLP satisfying $L$-exact Lipschitz regularization with $d$-dimensional state spaces, to obtain an $\varepsilon$-optimal root node solution, we prove that the number of iterations of the proposed deterministic sampling algorithm is upper bounded by $\mathcal{O}((\frac{2LT}{\varepsilon})^d)$, and is lower bounded by $\mathcal{O}((\frac{LT}{4\varepsilon})^d)$ for the general case or by $\mathcal{O}((\frac{LT}{8\varepsilon})^{d/2-1})$ for the convex case. This shows that the obtained complexity bounds are rather sharp. It also reveals that the iteration complexity depends polynomially on the number of stages. We further show that the iteration complexity depends linearly on $T$ if all the state spaces are finite sets, or if we seek a $(T\varepsilon)$-optimal solution when the state spaces are infinite sets, i.e., allowing the optimality gap to scale with $T$. To the best of our knowledge, this is the first work that reports global optimization algorithms as well as iteration complexity results for solving such a large class of multistage stochastic programs. The iteration complexity study resolves a conjecture by the late Prof. Shabbir Ahmed in the general setting of multistage stochastic mixed-integer optimization. (A toy Python sketch of cut-based lower approximation appears after this list.)
- Abstract: We continue the program of proving circuit lower bounds via circuit satisfiability algorithms. So far, this program has yielded several concrete results, proving that functions in $\mathsf{Quasi}\text{-}\mathsf{NP} = \mathsf{NTIME}[n^{(\log n)^{O(1)}}]$ and other complexity classes do not have small circuits (in the worst case and/or on average) from various circuit classes $\mathcal{C}$, by showing that $\mathcal{C}$ admits non-trivial satisfiability and/or #SAT algorithms which beat exhaustive search by a minor amount. In this paper, we present a new strong lower bound consequence of having a non-trivial #SAT algorithm for a circuit class $\mathcal{C}$. Say that a symmetric Boolean function $f(x_1,\ldots,x_n)$ is sparse if it outputs 1 on $O(1)$ values of $\sum_i x_i$. We show that for every sparse $f$, and for all "typical" $\mathcal{C}$, faster #SAT algorithms for $\mathcal{C}$ circuits imply lower bounds against the circuit class $f \circ \mathcal{C}$, which may be stronger than $\mathcal{C}$ itself. In particular: (i) #SAT algorithms for $n^k$-size $\mathcal{C}$-circuits running in $2^n/n^k$ time (for all $k$) imply $\mathsf{NEXP}$ does not have $(f \circ \mathcal{C})$-circuits of polynomial size; (ii) #SAT algorithms for $2^{n^{\varepsilon}}$-size $\mathcal{C}$-circuits running in $2^{n-n^{\varepsilon}}$ time (for some $\varepsilon > 0$) imply $\mathsf{Quasi}\text{-}\mathsf{NP}$ does not have $(f \circ \mathcal{C})$-circuits of polynomial size. Applying #SAT algorithms from the literature, one immediate corollary of our results is that $\mathsf{Quasi}\text{-}\mathsf{NP}$ does not have $\mathsf{EMAJ} \circ \mathsf{ACC}^0 \circ \mathsf{THR}$ circuits of polynomial size, where $\mathsf{EMAJ}$ is the "exact majority" function, improving previous lower bounds against $\mathsf{ACC}^0$ [Williams JACM'14] and $\mathsf{ACC}^0 \circ \mathsf{THR}$ [Williams STOC'14], [Murray-Williams STOC'18]. This is the first nontrivial lower bound against such a circuit class. (A small brute-force #SAT illustration appears after this list.)
- Abstract: A well-known open problem of Meir and Moser asks if the squares of sidelength $1/n$ for $n\ge 2$ can be packed perfectly into a rectangle of area $\sum_{n=2}^\infty n^{-2}=\pi^2/6-1$. In this paper we show that for any $1/2<t<1$, and any $n_0$ that is sufficiently large depending on $t$, the squares of sidelength $n^{-t}$ for $n\ge n_0$ can be packed perfectly into a square of area $\sum_{n=n_0}^\infty n^{-2t}$. This was previously known (if one packs a rectangle instead of a square) for $1/2<t\le 2/3$ (in which case one can take $n_0=1$). (A short numerical check of these areas appears after this list.)
- Abstract: Sequence mappability is an important task in genome resequencing. In the $(k,m)$-mappability problem, for a given sequence $T$ of length $n$, the goal is to compute a table whose $i$th entry is the number of indices $j \ne i$ such that the length-$m$ substrings of $T$ starting at positions $i$ and $j$ have at most $k$ mismatches. Previous works on this problem focused on heuristics computing a rough approximation of the result or on the case of $k=1$. We present several efficient algorithms for the general case of the problem. Our main result is an algorithm that, for $k=O(1)$, works in $O(n)$ space and, with high probability, in $O(n \cdot \min\{m^k,\log^k n\})$ time. Our algorithm requires a careful adaptation of the $k$-errata trees of Cole et al. [STOC 2004] to avoid multiple counting of pairs of substrings. Our technique can also be applied to solve the all-pairs Hamming distance problem introduced by Crochemore et al. [WABI 2017]. We further develop $O(n^2)$-time algorithms to compute all $(k,m)$-mappability tables for a fixed $m$ and all $k\in\{0,\ldots,m\}$, or for a fixed $k$ and all $m\in\{k,\ldots,n\}$. Finally, we show that, for $k,m = \Theta(\log n)$, the $(k,m)$-mappability problem cannot be solved in strongly subquadratic time unless the Strong Exponential Time Hypothesis fails. This is an improved and extended version of a paper presented at SPIRE 2018. (A brute-force reference implementation appears after this list.)
- Abstract: We prove that $\mathrm{poly}(t)\cdot n^{1/D}$-depth local random quantum circuits with two-qudit nearest-neighbor gates on a $D$-dimensional lattice with $n$ qudits are approximate $t$-designs in various measures. These include the "monomial" measure, meaning that the monomials of a random circuit from this family have expectation close to the value that would result from the Haar measure. Previously, the best bound was $\mathrm{poly}(t)\cdot n$ due to Brandão–Harrow–Horodecki (Commun Math Phys 346(2):397–434, 2016) for $D=1$. We also improve the "scrambling" and "decoupling" bounds for spatially local random circuits due to Brown and Fawzi (Scrambling speed of random quantum circuits, 2012). One consequence of our result is that, assuming the polynomial hierarchy ($\mathsf{PH}$) is infinite and that certain counting problems are $\#\mathsf{P}$-hard "on average", sampling within total variation distance from these circuits is hard for classical computers. Previously, exact sampling from the outputs of even constant-depth quantum circuits was known to be hard for classical computers under these assumptions. However, the standard strategy for extending this hardness result to approximate sampling requires the quantum circuits to have a property called "anti-concentration", meaning roughly that the output has near-maximal entropy. Unitary 2-designs have the desired anti-concentration property. Our result improves the required depth for this level of anti-concentration from linear depth to a sub-linear value, depending on the geometry of the interactions. This is relevant to a recent experiment by the Google Quantum AI group to perform such a sampling task with 53 qubits on a two-dimensional lattice (Arute et al. in Nature 574(7779):505–510, 2019; Boixo et al. in Nat Phys 14(6):595–600, 2018) (and related experiments by USTC), and confirms their conjecture that $O(\sqrt{n})$ depth suffices for anti-concentration. The proof is based on a previous construction of $t$-designs by Brandão et al. (2016), an analysis of how approximate designs behave under composition, and an extension of the quasi-orthogonality of permutation operators developed by Brandão et al. (2016). Different versions of the approximate design condition correspond to different norms, and part of our contribution is to introduce the norm corresponding to anti-concentration and to establish equivalence between these various norms for low-depth circuits. For random circuits with long-range gates, we use different methods to show that anti-concentration happens at circuit size $O(n\ln^2 n)$, corresponding to depth $O(\ln^3 n)$. We also show a lower bound of $\Omega(n \ln n)$ for the size of such a circuit in this case. We also prove that anti-concentration is possible in depth $O(\ln n \ln\ln n)$ (size $O(n \ln n \ln\ln n)$) using a different model. (A small anti-concentration simulation appears after this list.)
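For the first related item (SDDP-type algorithms for MS-MINLP), the sketch below illustrates only the flavor of cut-based lower approximation of a nonconvex value function: it accumulates Lipschitz "reverse-norm" cuts of the form $Q(x) \ge Q(x_k) - L\,|x - x_k|$, a simple special case of the generalized-conjugacy cuts mentioned in the abstract. The toy value function, the constant $L$, and the sampled points are assumptions for illustration; this is not the paper's algorithm.

```python
import numpy as np

# Toy illustration of cut-based under-approximation for a nonconvex,
# Lipschitz value function Q. Each evaluation point x_k contributes a
# reverse-norm cut  Q(x) >= Q(x_k) - L*|x - x_k|, and the lower bound
# is the pointwise maximum of the collected cuts. This is a simplified
# stand-in for the generalized-conjugacy cuts in the abstract.
L = 3.5                                                   # assumed Lipschitz constant of Q
Q = lambda x: np.abs(np.sin(3 * x)) + 0.5 * np.abs(x - 1.0)   # toy nonconvex value function

xs = np.linspace(0.0, 2.0, 401)                           # grid used only to measure the gap
cut_points = []

def lower_bound(x):
    """Pointwise maximum of all Lipschitz cuts collected so far."""
    if not cut_points:
        return np.full_like(x, -np.inf)
    return np.max([Q(xk) - L * np.abs(x - xk) for xk in cut_points], axis=0)

for k, xk in enumerate(np.linspace(0.1, 1.9, 8)):         # "sampled" states
    cut_points.append(xk)
    gap = np.max(Q(xs) - lower_bound(xs))
    print(f"cuts={k + 1:2d}  worst-case gap to Q on the grid = {gap:.3f}")
```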
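For the second related item (lower bounds from #SAT algorithms), the tiny script below just unpacks two definitions from the abstract on toy instances: it checks that the exact-majority function EMAJ is "sparse" (under the common convention that it accepts exactly one input weight), and it runs the trivial exhaustive-search #SAT count that the abstract's "non-trivial" algorithms are required to beat. The small AND/OR circuit is a made-up example.

```python
from itertools import product

def emaj(bits):
    # "Exact majority": 1 iff exactly half of the inputs are 1 (n even here).
    # One common convention; it is sparse in the abstract's sense because it
    # accepts only O(1) values of sum(bits).
    return int(sum(bits) == len(bits) // 2)

n = 6
accepted_weights = {sum(x) for x in product((0, 1), repeat=n) if emaj(x)}
print("input weights accepted by EMAJ on", n, "bits:", accepted_weights)

# Exhaustive-search #SAT (the 2^n baseline that non-trivial #SAT algorithms must beat),
# here for a made-up small circuit: C(x) = (x0 OR x1) AND (NOT x2 OR x3).
def toy_circuit(x):
    return (x[0] or x[1]) and ((not x[2]) or x[3])

count = sum(1 for x in product((0, 1), repeat=4) if toy_circuit(x))
print("#SAT of the toy circuit by brute force:", count)
```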
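For the third related item (perfect square packing), this short script evaluates the two areas in the abstract numerically: the Meir-Moser area $\sum_{n\ge 2} n^{-2} = \pi^2/6 - 1$, and the total area $\sum_{n\ge n_0} n^{-2t}$ of the squares of sidelength $n^{-t}$ for an illustrative choice of $t$ and $n_0$ (both arbitrary), with the tail of the series bracketed by a simple integral bound.

```python
import math

# Meir-Moser configuration: squares of sidelength 1/n for n >= 2.
meir_moser_area = sum(n ** -2 for n in range(2, 10**6))
print("sum_{n>=2} 1/n^2 ~", meir_moser_area, " vs  pi^2/6 - 1 =", math.pi**2 / 6 - 1)

# Total area of squares of sidelength n^{-t} for n >= n0 (illustrative t, n0).
t, n0, N = 0.6, 100, 10**6
partial = sum(n ** (-2 * t) for n in range(n0, N))
tail_upper = (N - 1) ** (1 - 2 * t) / (2 * t - 1)   # integral bound on the remaining tail
print(f"area to pack for t={t}, n0={n0}: between {partial:.4f} and {partial + tail_upper:.4f}")
```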
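For the fourth related item ($(k,m)$-mappability), the following brute-force reference implementation computes the table directly from the definition in roughly $O(n^2 m)$ time. It is meant only as a specification to test faster implementations against, not as the $O(n \cdot \min\{m^k,\log^k n\})$-time algorithm of the paper; the example string is arbitrary.

```python
def mappability_table(T, k, m):
    """Brute-force (k, m)-mappability: entry i counts indices j != i such that
    the length-m substrings of T starting at i and j differ in at most k positions.
    Runs in O(n^2 * m) time; a reference implementation, not the paper's algorithm."""
    n = len(T)
    table = [0] * (n - m + 1)
    for i in range(n - m + 1):
        for j in range(n - m + 1):
            if i == j:
                continue
            mismatches = sum(1 for a, b in zip(T[i:i + m], T[j:j + m]) if a != b)
            if mismatches <= k:
                table[i] += 1
    return table

print(mappability_table("abracadabra", k=1, m=3))
```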
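For the last related item (random circuits as approximate designs), the NumPy sketch below simulates a small 1D brickwork circuit of Haar-random two-qubit gates and tracks the collision probability $\sum_x p(x)^2$ of the output distribution, which approaches the Haar value $2/(2^n+1)$ as the circuit anti-concentrates. The qubit count and depth are tiny illustrative choices; this is a numerical illustration of the anti-concentration property discussed in the abstract, not a reproduction of the paper's bounds.

```python
import numpy as np

def haar_unitary(d, rng):
    """Sample a d x d Haar-random unitary via QR with phase correction."""
    z = (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def apply_two_qubit(psi, gate, i):
    """Apply a 4x4 gate to qubits (i, i+1) of a state with shape (2,)*n."""
    psi = np.moveaxis(psi, (i, i + 1), (0, 1))
    shape = psi.shape
    psi = (gate @ psi.reshape(4, -1)).reshape(shape)
    return np.moveaxis(psi, (0, 1), (i, i + 1))

n, depth = 8, 12                      # tiny illustrative sizes
rng = np.random.default_rng(1)
psi = np.zeros((2,) * n, dtype=complex)
psi[(0,) * n] = 1.0                   # start in |0...0>

for layer in range(depth):
    start = layer % 2                 # brickwork pattern of nearest-neighbor gates
    for i in range(start, n - 1, 2):
        psi = apply_two_qubit(psi, haar_unitary(4, rng), i)
    p = np.abs(psi.reshape(-1)) ** 2
    print(f"depth {layer + 1:2d}: collision prob = {np.sum(p**2):.5f} "
          f"(Haar value ~ {2 / (2**n + 1):.5f})")
```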