 Award ID(s):
 1836900
 NSFPAR ID:
 10284962
 Date Published:
 Journal Name:
Advances in Neural Information Processing Systems (NeurIPS)
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

In this paper, we propose polynomial forms to represent distributions of state variables over time for discrete-time stochastic dynamical systems. This problem arises in a variety of applications in areas ranging from biology to robotics. Our approach allows us to rigorously represent the probability distribution of state variables over time, and provide guaranteed bounds on the expectations, moments and probabilities of tail events involving the state variables. First, we recall ideas from interval arithmetic, and use them to rigorously represent the state variables at time t as a function of the initial state variables and noise symbols that model the random exogenous inputs encountered before time t. Next, we show how concentration of measure inequalities can be employed to prove rigorous bounds on the tail probabilities of these state variables. We present applications showing how our approach can be useful in some situations to establish mathematically guaranteed bounds that are of a different nature from those obtained through simulations with pseudorandom numbers.

Non-asymptotic tail bounds of random variables play crucial roles in probability, statistics, and machine learning. Despite much success in developing upper bounds on tail probabilities in the literature, lower bounds on tail probabilities are relatively scarce. In this paper, we introduce systematic and user-friendly schemes for developing non-asymptotic lower bounds on tail probabilities. In addition, we develop sharp lower tail bounds for the sum of independent sub-Gaussian and sub-exponential random variables, which match the classic Hoeffding-type and Bernstein-type concentration inequalities, respectively. We also provide non-asymptotic matching upper and lower tail bounds for a suite of distributions, including gamma, beta, (regular, weighted, and noncentral) chi-square, binomial, Poisson, Irwin–Hall, etc. We apply the result to establish matching upper and lower bounds for the extreme value expectation of the sum of independent sub-Gaussian and sub-exponential random variables. Finally, we consider a statistical application to signal identification from sparse heterogeneous mixtures.
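
As a rough illustration of the gap that such lower bounds aim to close, the sketch below compares the classic Hoeffding upper bound with the exact tail probability for a sum of independent fair ±1 signs. The function names and the choice of Rademacher summands are assumptions made for this example; the code does not implement the lower bounds developed in the paper.

```python
import math


def rademacher_tail_exact(n: int, t: float) -> float:
    """Exact P(S_n >= t), where S_n is a sum of n independent fair +/-1 signs."""
    # S_n = 2*B - n with B ~ Binomial(n, 1/2), so S_n >= t iff B >= (n + t) / 2.
    k_min = max(0, math.ceil((n + t) / 2))
    return sum(math.comb(n, k) for k in range(k_min, n + 1)) / 2 ** n


def hoeffding_upper(n: int, t: float) -> float:
    """Hoeffding upper bound on P(S_n >= t) for summands taking values in [-1, 1]."""
    return math.exp(-t * t / (2 * n))


if __name__ == "__main__":
    n = 100
    for t in (10, 20, 30, 40):
        print(t, rademacher_tail_exact(n, t), hoeffding_upper(n, t))
```

The exact tail always sits below the Hoeffding bound; a matching lower bound would quantify how far below it can fall.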

In general, obtaining the exact steady-state distribution of queue lengths is not feasible. Therefore, we focus on establishing bounds for the tail probabilities of queue lengths. We examine queueing systems under Heavy Traffic (HT) conditions and provide exponentially decaying bounds for the probability P(εq > x), where ε is the HT parameter denoting how far the load is from the maximum allowed load. Our bounds are not limited to asymptotic cases and are applicable even for finite values of ε, and they get sharper as ε → 0. Consequently, we derive non-asymptotic convergence rates for the tail probabilities. Furthermore, our results offer bounds on the exponential rate of decay of the tail, given by (1/x) log P(εq > x) for any finite value of x. These can be interpreted as non-asymptotic versions of Large Deviation (LD) results. To obtain our results, we use an exponential Lyapunov function to bound the moment-generating function of queue lengths and apply Markov's inequality. We demonstrate our approach by presenting tail bounds for a continuous-time Join-the-shortest queue (JSQ) system.
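
The last step described above, applying Markov's inequality to a bounded moment-generating function, follows the generic Chernoff argument sketched below. The symbols θ and M(θ) are notation assumed for this illustration; the specific bound M(θ) obtained from the Lyapunov function is not stated in the abstract.

```latex
% Assume that, for some \theta > 0, the moment-generating function satisfies
% \mathbb{E}\big[e^{\theta \epsilon q}\big] \le M(\theta) < \infty.
\begin{align*}
P(\epsilon q > x)
  &= P\big(e^{\theta \epsilon q} > e^{\theta x}\big) \\
  &\le e^{-\theta x}\, \mathbb{E}\big[e^{\theta \epsilon q}\big]
     && \text{(Markov's inequality)} \\
  &\le M(\theta)\, e^{-\theta x},
\end{align*}
% so that, for every finite x > 0,
\[
  \frac{1}{x} \log P(\epsilon q > x) \;\le\; -\theta + \frac{1}{x} \log M(\theta).
\]
```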

We propose new differential privacy solutions for when external invariants and integer constraints are simultaneously enforced on the data product. These requirements arise in real-world applications of private data curation, including the public release of the 2020 U.S. Decennial Census. They pose a great challenge to the production of provably private data products with adequate statistical usability. We propose integer subspace differential privacy to rigorously articulate the privacy guarantee when data products maintain both the invariants and integer characteristics, and demonstrate the composition and post-processing properties of our proposal. To address the challenge of sampling from a potentially highly restricted discrete space, we devise a pair of unbiased additive mechanisms, the generalized Laplace and the generalized Gaussian mechanisms, by solving the Diophantine equations as defined by the constraints. The proposed mechanisms have good accuracy, with errors exhibiting sub-exponential and sub-Gaussian tail probabilities, respectively. To implement our proposal, we design an MCMC algorithm and supply empirical convergence assessment using estimated upper bounds on the total variation distance via L-lag coupling. We demonstrate the efficacy of our proposal with applications to a synthetic problem with intersecting invariants, a sensitive contingency table with known margins, and the 2010 Census county-level demonstration data with mandated fixed state population totals.

We consider the problem of estimating the mean of a sequence of random elements f(θ, X_1), ..., f(θ, X_n), where f is a fixed scalar function, S = (X_1, ..., X_n) are independent random variables, and θ is a possibly S-dependent parameter. An example of such a problem would be to estimate the generalization error of a neural network trained on n examples, where f is a loss function. Classically, this problem is approached through concentration inequalities holding uniformly over compact parameter sets of functions f, for example as in Rademacher or VC type analysis. However, in many problems, such inequalities yield numerically vacuous estimates. Recently, the PAC-Bayes framework has been proposed as a better alternative for this class of problems because it often gives numerically non-vacuous bounds. In this paper, we show that we can do even better: we show how to refine the proof strategy of the PAC-Bayes bounds and achieve even tighter guarantees. Our approach is based on the coin-betting framework that derives the numerically tightest known time-uniform concentration inequalities from the regret guarantees of online gambling algorithms. In particular, we derive the first PAC-Bayes concentration inequality based on the coin-betting approach that holds simultaneously for all sample sizes. We demonstrate its tightness by showing that relaxing it yields a number of previous results in closed form, including Bernoulli-KL and empirical Bernstein inequalities. Finally, we propose an efficient algorithm to numerically calculate confidence sequences from our bound, which often generates non-vacuous confidence bounds even with one sample, unlike the state-of-the-art PAC-Bayes bounds.