skip to main content

Title: Stochastic dual dynamic programming for multistage stochastic mixed-integer nonlinear optimization

In this paper, we study multistage stochastic mixed-integer nonlinear programs (MS-MINLP). This general class of problems encompasses, as important special cases, multistage stochastic convex optimization withnon-Lipschitzianvalue functions and multistage stochastic mixed-integer linear optimization. We develop stochastic dual dynamic programming (SDDP) type algorithms with nested decomposition, deterministic sampling, and stochastic sampling. The key ingredient is a new type of cuts based on generalized conjugacy. Several interesting classes of MS-MINLP are identified, where the new algorithms are guaranteed to obtain the global optimum without the assumption of complete recourse. This significantly generalizes the classic SDDP algorithms. We also characterize the iteration complexity of the proposed algorithms. In particular, for a$$(T+1)$$(T+1)-stage stochastic MINLP satisfyingL-exact Lipschitz regularization withd-dimensional state spaces, to obtain an$$\varepsilon $$ε-optimal root node solution, we prove that the number of iterations of the proposed deterministic sampling algorithm is upper bounded by$${\mathcal {O}}((\frac{2LT}{\varepsilon })^d)$$O((2LTε)d), and is lower bounded by$${\mathcal {O}}((\frac{LT}{4\varepsilon })^d)$$O((LT4ε)d)for the general case or by$${\mathcal {O}}((\frac{LT}{8\varepsilon })^{d/2-1})$$O((LT8ε)d/2-1)for the convex case. This shows that the obtained complexity bounds are rather sharp. It also reveals that the iteration complexity dependspolynomiallyon the number of stages. We further show that the iteration complexity dependslinearlyonT, if all the state spaces are finite sets, or if more » we seek a$$(T\varepsilon )$$(Tε)-optimal solution when the state spaces are infinite sets, i.e. allowing the optimality gap to scale withT. To the best of our knowledge, this is the first work that reports global optimization algorithms as well as iteration complexity results for solving such a large class of multistage stochastic programs. The iteration complexity study resolves a conjecture by the late Prof. Shabbir Ahmed in the general setting of multistage stochastic mixed-integer optimization.

« less
Publication Date:
Journal Name:
Mathematical Programming
Page Range or eLocation-ID:
p. 935-985
Springer Science + Business Media
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    We continue the program of proving circuit lower bounds via circuit satisfiability algorithms. So far, this program has yielded several concrete results, proving that functions in$\mathsf {Quasi}\text {-}\mathsf {NP} = \mathsf {NTIME}[n^{(\log n)^{O(1)}}]$Quasi-NP=NTIME[n(logn)O(1)]and other complexity classes do not have small circuits (in the worst case and/or on average) from various circuit classes$\mathcal { C}$C, by showing that$\mathcal { C}$Cadmits non-trivial satisfiability and/or#SAT algorithms which beat exhaustive search by a minor amount. In this paper, we present a new strong lower bound consequence of having a non-trivial#SAT algorithm for a circuit class${\mathcal C}$C. Say that a symmetric Boolean functionf(x1,…,xn) issparseif it outputs 1 onO(1) values of${\sum }_{i} x_{i}$ixi. We show that for every sparsef, and for all “typical”$\mathcal { C}$C, faster#SAT algorithms for$\mathcal { C}$Ccircuits imply lower bounds against the circuit class$f \circ \mathcal { C}$fC, which may bestrongerthan$\mathcal { C}$Citself. In particular:

    #SAT algorithms fornk-size$\mathcal { C}$C-circuits running in 2n/nktime (for allk) implyNEXPdoes not have$(f \circ \mathcal { C})$(fC)-circuits of polynomial size.

    #SAT algorithms for$2^{n^{{\varepsilon }}}$2nε-size$\mathcal { C}$C-circuits running in$2^{n-n^{{\varepsilon }}}$2nnεtime (for someε> 0) implyQuasi-NPdoes not have$(f \circ \mathcal { C})$(fC)-circuits of polynomial size.

    Applying#SAT algorithms from the literature, one immediate corollary of our results is thatQuasi-NPdoes not haveEMAJACC0THRcircuits of polynomialmore »size, whereEMAJis the “exact majority” function, improving previous lower bounds againstACC0[Williams JACM’14] andACC0THR[Williams STOC’14], [Murray-Williams STOC’18]. This is the first nontrivial lower bound against such a circuit class.

    « less
  2. Abstract

    We consider the problem of covering multiple submodular constraints. Given a finite ground setN, a weight function$$w: N \rightarrow \mathbb {R}_+$$w:NR+,rmonotone submodular functions$$f_1,f_2,\ldots ,f_r$$f1,f2,,froverNand requirements$$k_1,k_2,\ldots ,k_r$$k1,k2,,krthe goal is to find a minimum weight subset$$S \subseteq N$$SNsuch that$$f_i(S) \ge k_i$$fi(S)kifor$$1 \le i \le r$$1ir. We refer to this problem asMulti-Submod-Coverand it was recently considered by Har-Peled and Jones (Few cuts meet many point sets. CoRR.arxiv:abs1808.03260Har-Peled and Jones 2018) who were motivated by an application in geometry. Even with$$r=1$$r=1Multi-Submod-Covergeneralizes the well-known Submodular Set Cover problem (Submod-SC), and it can also be easily reduced toSubmod-SC. A simple greedy algorithm gives an$$O(\log (kr))$$O(log(kr))approximation where$$k = \sum _i k_i$$k=ikiand this ratio cannot be improved in the general case. In this paper, motivated by several concrete applications, we consider two ways to improve upon the approximation given by the greedy algorithm. First, we give a bicriteria approximation algorithm forMulti-Submod-Coverthat covers each constraint to within a factor of$$(1-1/e-\varepsilon )$$(1-1/e-ε)while incurring an approximation of$$O(\frac{1}{\epsilon }\log r)$$O(1ϵlogr)in the cost. Second, we consider the special case when each$$f_i$$fiis a obtained from a truncated coverage function and obtain an algorithm that generalizes previous work on partial set cover (Partial-SC), covering integer programs (CIPs) and multiple vertex cover constraintsmore »Bera et al. (Theoret Comput Sci 555:2–8 Bera et al. 2014). Both these algorithms are based on mathematical programming relaxations that avoid the limitations of the greedy algorithm. We demonstrate the implications of our algorithms and related ideas to several applications ranging from geometric covering problems to clustering with outliers. Our work highlights the utility of the high-level model and the lens of submodularity in addressing this class of covering problems.

    « less
  3. Abstract

    A long-standing problem in mathematical physics is the rigorous derivation of the incompressible Euler equation from Newtonian mechanics. Recently, Han-Kwan and Iacobelli (Proc Am Math Soc 149:3045–3061, 2021) showed that in the monokinetic regime, one can directly obtain the Euler equation from a system ofNparticles interacting in$${\mathbb {T}}^d$$Td,$$d\ge 2$$d2, via Newton’s second law through asupercritical mean-field limit. Namely, the coupling constant$$\lambda $$λin front of the pair potential, which is Coulombic, scales like$$N^{-\theta }$$N-θfor some$$\theta \in (0,1)$$θ(0,1), in contrast to the usual mean-field scaling$$\lambda \sim N^{-1}$$λN-1. Assuming$$\theta \in (1-\frac{2}{d(d+1)},1)$$θ(1-2d(d+1),1), they showed that the empirical measure of the system is effectively described by the solution to the Euler equation as$$N\rightarrow \infty $$N. Han-Kwan and Iacobelli asked if their range for$$\theta $$θwas optimal. We answer this question in the negative by showing the validity of the incompressible Euler equation in the limit$$N\rightarrow \infty $$Nfor$$\theta \in (1-\frac{2}{d},1)$$θ(1-2d,1). Our proof is based on Serfaty’s modulated-energy method, but compared to that of Han-Kwan and Iacobelli, crucially uses an improved “renormalized commutator” estimate to obtain the larger range for$$\theta $$θ. Additionally, we show that for$$\theta \le 1-\frac{2}{d}$$θ1-2d, one cannot, in general, expect convergence in the modulated energy notion of distance.

  4. Abstract

    The free multiplicative Brownian motion$$b_{t}$$btis the large-Nlimit of the Brownian motion on$$\mathsf {GL}(N;\mathbb {C}),$$GL(N;C),in the sense of$$*$$-distributions. The natural candidate for the large-Nlimit of the empirical distribution of eigenvalues is thus the Brown measure of$$b_{t}$$bt. In previous work, the second and third authors showed that this Brown measure is supported in the closure of a region$$\Sigma _{t}$$Σtthat appeared in the work of Biane. In the present paper, we compute the Brown measure completely. It has a continuous density$$W_{t}$$Wton$$\overline{\Sigma }_{t},$$Σ¯t,which is strictly positive and real analytic on$$\Sigma _{t}$$Σt. This density has a simple form in polar coordinates:$$\begin{aligned} W_{t}(r,\theta )=\frac{1}{r^{2}}w_{t}(\theta ), \end{aligned}$$Wt(r,θ)=1r2wt(θ),where$$w_{t}$$wtis an analytic function determined by the geometry of the region$$\Sigma _{t}$$Σt. We show also that the spectral measure of free unitary Brownian motion$$u_{t}$$utis a “shadow” of the Brown measure of$$b_{t}$$bt, precisely mirroring the relationship between the circular and semicircular laws. We develop several new methods, based on stochastic differential equations and PDE, to prove these results.

  5. Abstract

    The softmax policy gradient (PG) method, which performs gradient ascent under softmax policy parameterization, is arguably one of the de facto implementations of policy optimization in modern reinforcement learning. For$$\gamma $$γ-discounted infinite-horizon tabular Markov decision processes (MDPs), remarkable progress has recently been achieved towards establishing global convergence of softmax PG methods in finding a near-optimal policy. However, prior results fall short of delineating clear dependencies of convergence rates on salient parameters such as the cardinality of the state space$${\mathcal {S}}$$Sand the effective horizon$$\frac{1}{1-\gamma }$$11-γ, both of which could be excessively large. In this paper, we deliver a pessimistic message regarding the iteration complexity of softmax PG methods, despite assuming access to exact gradient computation. Specifically, we demonstrate that the softmax PG method with stepsize$$\eta $$ηcan take$$\begin{aligned} \frac{1}{\eta } |{\mathcal {S}}|^{2^{\Omega \big (\frac{1}{1-\gamma }\big )}} ~\text {iterations} \end{aligned}$$1η|S|2Ω(11-γ)iterationsto converge, even in the presence of a benign policy initialization and an initial state distribution amenable to exploration (so that the distribution mismatch coefficient is not exceedingly large). This is accomplished by characterizing the algorithmic dynamics over a carefully-constructed MDP containing only three actions. Our exponential lower bound hints at the necessity of carefully adjusting update rules or enforcing proper regularization inmore »accelerating PG methods.

    « less