skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Friday, November 15 until 2:00 AM ET on Saturday, November 16 due to maintenance. We apologize for the inconvenience.


Title: Reasoning about Uncertainties in Discrete-Time Dynamical Systems using Polynomial Forms
In this paper, we propose polynomial forms to represent distributions of state variables over time for discrete-time stochastic dynamical systems. This problem arises in a variety of applications in areas ranging from biology to robotics. Our approach allows us to rigorously represent the probability distribution of state variables over time, and provide guaranteed bounds on the expectations, moments and probabilities of tail events involving the state variables. First we recall ideas from interval arithmetic, and use them to rigorously represent the state variables at time t as a function of the initial state variables and noise symbols that model the random exogenous inputs encountered before time t. Next we show how concentration of measure inequalities can be employed to prove rigorous bounds on the tail probabilities of these state variables. We demonstrate interesting applications that demonstrate how our approach can be useful in some situations to establish mathematically guaranteed bounds that are of a different nature from those obtained through simulations with pseudo-random numbers.  more » « less
Award ID(s):
1836900
NSF-PAR ID:
10284962
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Advances in Neural Information Processing System (NeurIPS)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we propose polynomial forms to represent distributions of state variables over time for discrete-time stochastic dynamical systems. This problem arises in a variety of applications in areas ranging from biology to robotics. Our approach allows us to rigorously represent the probability distribution of state variables over time, and provide guaranteed bounds on the expectations, moments and probabilities of tail events involving the state variables. First, we recall ideas from interval arithmetic, and use them to rigorously represent the state variables at time t as a function of the initial state variables and noise symbols that model the random exogenous inputs encountered before time t. Next, we show how concentration of measure inequalities can be employed to prove rigorous bounds on the tail probabilities of these state variables. We demonstrate interesting applications that demonstrate how our approach can be useful in some situations to establish mathematically guaranteed bounds that are of a different nature from those obtained through simulations with pseudo-random numbers. 
    more » « less
  2. The non‐asymptotic tail bounds of random variables play crucial roles in probability, statistics, and machine learning. Despite much success in developing upper bounds on tail probabilities in literature, the lower bounds on tail probabilities are relatively fewer. In this paper, we introduce systematic and user‐friendly schemes for developing non‐asymptotic lower bounds of tail probabilities. In addition, we develop sharp lower tail bounds for the sum of independent sub‐Gaussian and sub‐exponential random variables, which match the classic Hoeffding‐type and Bernstein‐type concentration inequalities, respectively. We also provide non‐asymptotic matching upper and lower tail bounds for a suite of distributions, including gamma, beta, (regular, weighted, and noncentral) chi‐square, binomial, Poisson, Irwin–Hall, etc. We apply the result to establish the matching upper and lower bounds for extreme value expectation of the sum of independent sub‐Gaussian and sub‐exponential random variables. A statistical application of signal identification from sparse heterogeneous mixtures is finally considered.

     
    more » « less
  3. In general, obtaining the exact steady-state distribution of queue lengths is not feasible. Therefore, we focus on establishing bounds for the tail probabilities of queue lengths. We examine queueing systems under Heavy Traffic (HT) conditions and provide exponentially decaying bounds for the probability P(∈q > x), where ∈ is the HT parameter denoting how far the load is from the maximum allowed load. Our bounds are not limited to asymptotic cases and are applicable even for finite values of ∈, and they get sharper as ∈ - 0. Consequently, we derive non-asymptotic convergence rates for the tail probabilities. Furthermore, our results offer bounds on the exponential rate of decay of the tail, given by -1/2 log P(∈q > x) for any finite value of x. These can be interpreted as non-asymptotic versions of Large Deviation (LD) results. To obtain our results, we use an exponential Lyapunov function to bind the moment-generating function of queue lengths and apply Markov's inequality. We demonstrate our approach by presenting tail bounds for a continuous time Join-the-shortest queue (JSQ) system.

     
    more » « less
  4. We propose new differential privacy solutions for when external invariants and integer constraints are simultaneously enforced on the data product. These requirements arise in real world applications of private data curation, including the public release of the 2020 U.S. Decennial Census. They pose a great challenge to the production of provably private data products with adequate statistical usability. We propose integer subspace differential privacy to rigorously articulate the privacy guarantee when data products maintain both the invariants and integer characteristics, and demonstrate the composition and post-processing properties of our proposal. To address the challenge of sampling from a potentially highly restricted discrete space, we devise a pair of unbiased additive mechanisms, the generalized Laplace and the generalized Gaussian mechanisms, by solving the Diophantine equations as defined by the constraints. The proposed mechanisms have good accuracy, with errors exhibiting sub-exponential and sub-Gaussian tail probabilities respectively. To implement our proposal, we design an MCMC algorithm and supply empirical convergence assessment using estimated upper bounds on the total variation distance via L-lag coupling. We demonstrate the efficacy of our proposal with applications to a synthetic problem with intersecting invariants, a sensitive contingency table with known margins, and the 2010 Census county-level demonstration data with mandated fixed state population totals. 
    more » « less
  5. We consider the problem of estimating the mean of a sequence of random elements f (θ, X_1) , . . . , f (θ, X_n) where f is a fixed scalar function, S = (X_1, . . . , X_n) are independent random variables, and θ is a possibly S-dependent parameter. An example of such a problem would be to estimate the generalization error of a neural network trained on n examples where f is a loss function. Classically, this problem is approached through concentration inequalities holding uniformly over compact parameter sets of functions f , for example as in Rademacher or VC type analysis. However, in many problems, such inequalities often yield numerically vacuous estimates. Recently, the PAC-Bayes framework has been proposed as a better alternative for this class of problems for its ability to often give numerically non-vacuous bounds. In this paper, we show that we can do even better: we show how to refine the proof strategy of the PAC-Bayes bounds and achieve even tighter guarantees. Our approach is based on the coin-betting framework that derives the numerically tightest known time-uniform concentration inequalities from the regret guarantees of online gambling algorithms. In particular, we derive the first PAC-Bayes concentration inequality based on the coin-betting approach that holds simultaneously for all sample sizes. We demonstrate its tightness showing that by relaxing it we obtain a number of previous results in a closed form including Bernoulli-KL and empirical Bernstein inequalities. Finally, we propose an efficient algorithm to numerically calculate confidence sequences from our bound, which often generates nonvacuous confidence bounds even with one sample, unlike the state-of-the-art PAC-Bayes bounds. 
    more » « less