skip to main content


Title: Distributionally Robust Optimization with Confidence Bands for Probability Density Functions
Distributionally robust optimization (DRO) has been introduced for solving stochastic programs in which the distribution of the random variables is unknown and must be estimated by samples from that distribution. A key element of DRO is the construction of the ambiguity set, which is a set of distributions that contains the true distribution with a high probability. Assuming that the true distribution has a probability density function, we propose a class of ambiguity sets based on confidence bands of the true density function. As examples, we consider the shape-restricted confidence bands and the confidence bands constructed with a kernel density estimation technique. The former allows us to incorporate the prior knowledge of the shape of the underlying density function (e.g., unimodality and monotonicity), and the latter enables us to handle multidimensional cases. Furthermore, we establish the convergence of the optimal value of DRO to that of the underlying stochastic program as the sample size increases. The DRO with our ambiguity set involves functional decision variables and infinitely many constraints. To address this challenge, we apply duality theory to reformulate the DRO to a finite-dimensional stochastic program, which is amenable to a stochastic subgradient scheme as a solution method.  more » « less
Award ID(s):
1845444
NSF-PAR ID:
10301870
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
INFORMS Journal on Optimization
ISSN:
2575-1484
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. ubmodular functions have applications throughout machine learning, but in many settings, we do not have direct access to the underlying function f . We focus on stochastic functions that are given as an expectation of functions over a distribution P. In practice, we often have only a limited set of samples fi from P . The standard approach indirectly optimizes f by maximizing the sum of fi. However, this ignores generalization to the true (unknown) distribution. In this paper, we achieve better performance on the actual underlying function f by directly optimizing a combination of bias and variance. Algorith- mically, we accomplish this by showing how to carry out distributionally robust optimiza- tion (DRO) for submodular functions, pro- viding efficient algorithms backed by theoret- ical guarantees which leverage several novel contributions to the general theory of DRO. We also show compelling empirical evidence that DRO improves generalization to the un- known stochastic submodular function. 
    more » « less
  2. Distributionally robust optimization (DRO) is a powerful tool for decision making under uncertainty. It is particularly appealing because of its ability to leverage existing data. However, many practical problems call for decision- making with some auxiliary information, and DRO in the context of conditional distributions is not straightforward. We propose a conditional kernel distributionally robust optimiza- tion (CKDRO) method that enables robust decision making under conditional distributions through kernel DRO and the conditional mean operator in the reproducing kernel Hilbert space (RKHS). In particular, we consider problems where there is a correlation between the unknown variable y and an auxiliary observable variable x. Given past data of the two variables and a queried auxiliary variable, CKDRO represents the conditional distribution P(y|x) as the conditional mean operator in the RKHS space and quantifies the ambiguity set in the RKHS as well, which depends on the size of the dataset as well as the query point. To justify the use of RKHS, we demonstrate that the ambiguity set defined in RKHS can be viewed as a ball under a metric that is similar to the Wasserstein metric. The DRO is then dualized and solved via a finite dimensional convex program. The proposed CKDRO approach is applied to a generation scheduling problem and shows that the result of CKDRO is superior to common benchmarks in terms of quality and robustness. 
    more » « less
  3. In a chance constrained program (CCP), decision makers seek the best decision whose probability of violating the uncertainty constraints is within the prespecified risk level. As a CCP is often nonconvex and is difficult to solve to optimality, much effort has been devoted to developing convex inner approximations for a CCP, among which the conditional value-at-risk (CVaR) has been known to be the best for more than a decade. This paper studies and generalizes the ALSO-X, originally proposed by Ahmed, Luedtke, SOng, and Xie in 2017 , for solving a CCP. We first show that the ALSO-X resembles a bilevel optimization, where the upper-level problem is to find the best objective function value and enforce the feasibility of a CCP for a given decision from the lower-level problem, and the lower-level problem is to minimize the expectation of constraint violations subject to the upper bound of the objective function value provided by the upper-level problem. This interpretation motivates us to prove that when uncertain constraints are convex in the decision variables, ALSO-X always outperforms the CVaR approximation. We further show (i) sufficient conditions under which ALSO-X can recover an optimal solution to a CCP; (ii) an equivalent bilinear programming formulation of a CCP, inspiring us to enhance ALSO-X with a convergent alternating minimization method (ALSO-X+); and (iii) an extension of ALSO-X and ALSO-X+ to distributionally robust chance constrained programs (DRCCPs) under the ∞−Wasserstein ambiguity set. Our numerical study demonstrates the effectiveness of the proposed methods. 
    more » « less
  4. null (Ed.)
    We develop a simple Quantile Spacing (QS) method for accurate probabilistic estimation of one-dimensional entropy from equiprobable random samples, and compare it with the popular Bin-Counting (BC) and Kernel Density (KD) methods. In contrast to BC, which uses equal-width bins with varying probability mass, the QS method uses estimates of the quantiles that divide the support of the data generating probability density function (pdf) into equal-probability-mass intervals. And, whereas BC and KD each require optimal tuning of a hyper-parameter whose value varies with sample size and shape of the pdf, QS only requires specification of the number of quantiles to be used. Results indicate, for the class of distributions tested, that the optimal number of quantiles is a fixed fraction of the sample size (empirically determined to be ~0.25–0.35), and that this value is relatively insensitive to distributional form or sample size. This provides a clear advantage over BC and KD since hyper-parameter tuning is not required. Further, unlike KD, there is no need to select an appropriate kernel-type, and so QS is applicable to pdfs of arbitrary shape, including those with discontinuous slope and/or magnitude. Bootstrapping is used to approximate the sampling variability distribution of the resulting entropy estimate, and is shown to accurately reflect the true uncertainty. For the four distributional forms studied (Gaussian, Log-Normal, Exponential and Bimodal Gaussian Mixture), expected estimation bias is less than 1% and uncertainty is low even for samples of as few as 100 data points; in contrast, for KD the small sample bias can be as large as −10% and for BC as large as −50%. We speculate that estimating quantile locations, rather than bin-probabilities, results in more efficient use of the information in the data to approximate the underlying shape of an unknown data generating pdf. 
    more » « less
  5. Abstract

    This article proposes a novel distributionally robust optimization (DRO)‐based soft‐constrained model predictive control (MPC) framework to explicitly hedge against unknown external input terms in a linear state‐space system. Without a priori knowledge of the exact uncertainty distribution, this framework works with a lifted ambiguity set constructed using machine learning to incorporate the first‐order moment information. By adopting a linear performance measure and considering input and state constraints robustly with respect to a lifted support set, the DRO‐based MPC is reformulated as a robust optimization problem. The constraints are softened to ensure recursive feasibility. Theoretical results on optimality, feasibility, and stability are further discussed. Performance and computational efficiency of the proposed method are illustrated through motion control and building energy control systems, showing 18.3% less cost and 78.8% less constraint violations, respectively, while requiring one third of the CPU time compared to multi‐stage scenario based stochastic MPC.

     
    more » « less