In this article, we revisit the sequential source coding framework to analyze fundamental performance limitations of discrete-time stochastic control systems subject to feedback data-rate constraints over a finite time horizon. The basis of our results is a new characterization of the lower bound on the minimum total rate achieved by sequential codes subject to a total (across time) distortion constraint, and a computational algorithm that optimally allocates the rate-distortion, for a given distortion level, at each instant of time and for any fixed finite time horizon. This characterization facilitates the derivation of analytical, non-asymptotic, and finite-dimensional lower and upper bounds in two control-related scenarios: (a) a parallel time-varying Gauss–Markov process with identically distributed spatial components that are quantized and transmitted through a noiseless channel to a minimum mean-squared error decoder; and (b) a time-varying quantized linear quadratic Gaussian (LQG) closed-loop control system with identically distributed spatial components and a random data-rate allocation. Our non-asymptotic lower bound on the quantized LQG control problem reveals the absolute minimum data rates for (mean-square) stability of the time-varying plant over any fixed finite time horizon. We supplement our framework with illustrative simulation experiments.
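The optimal rate allocation across parallel Gaussian components is classically obtained by reverse water-filling; as a rough illustration of the idea (a generic textbook sketch, not the paper's algorithm), the fragment below bisects on the water level for independent Gaussian sources, with the variances and distortion budget chosen arbitrarily:

```python
from math import log2

def reverse_waterfill(variances, total_distortion):
    """Reverse water-filling for independent Gaussian sources: find the
    water level theta with sum_i min(theta, var_i) = total_distortion,
    then allocate R_i = 0.5 * log2(var_i / D_i) bits, where the
    per-component distortion is D_i = min(theta, var_i)."""
    lo, hi = 0.0, max(variances)
    for _ in range(100):  # bisection on the water level
        theta = 0.5 * (lo + hi)
        if sum(min(theta, v) for v in variances) > total_distortion:
            hi = theta
        else:
            lo = theta
    dists = [min(theta, v) for v in variances]
    rates = [0.5 * log2(v / d) for v, d in zip(variances, dists)]
    return rates, dists

rates, dists = reverse_waterfill([4.0, 1.0, 0.25], total_distortion=1.5)
```

Components whose variance falls below the water level receive zero rate, which is the hallmark of the reverse water-filling solution.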
Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation
This paper concerns error bounds for recursive equations subject to Markovian disturbances. Motivating examples abound within the fields of Markov chain Monte Carlo (MCMC) and reinforcement learning (RL), and many of these algorithms can be interpreted as special cases of stochastic approximation (SA). It is argued that it is not possible in general to obtain a Hoeffding bound on the error sequence, even when the underlying Markov chain is reversible and geometrically ergodic, such as the M/M/1 queue. This motivates the focus on mean-square error bounds for parameter estimates. It is shown that the mean-square error achieves the optimal rate of $O(1/n)$, subject to conditions on the step-size sequence. Moreover, the exact constants in the rate are obtained, which is of great value in algorithm design.
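To see the $O(1/n)$ mean-square error rate empirically, one can average an AR(1) chain and watch the MSE of the time average shrink; this is a generic illustration of the phenomenon, not the paper's analysis, and the chain parameters, replication count, and seed below are arbitrary choices:

```python
import random

def mc_mse(n, reps=200, rho=0.5, seed=0):
    """Empirical mean-square error of the time average (1/n) * sum X_k
    for an AR(1) chain X_{k+1} = rho * X_k + W_k with stationary mean 0.
    Theory predicts the MSE decays like O(1/n)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x, s = 0.0, 0.0
        for _ in range(n):
            x = rho * x + rng.gauss(0.0, 1.0)
            s += x
        total += (s / n) ** 2  # squared error; the true mean is 0
    return total / reps

mse_small, mse_large = mc_mse(100), mc_mse(1600)
```

With a 16-fold increase in the trajectory length, the empirical MSE drops by roughly the same factor, consistent with the $O(1/n)$ rate.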
Award ID(s): 1935389
NSF-PAR ID: 10483723
Publisher / Repository: Proceedings of Machine Learning Research
Date Published:
Journal Name: Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this


In this work, we revisit the generalization error of stochastic mirror descent for quadratically bounded losses studied in Telgarsky (2022). Quadratically bounded losses form a broad class of loss functions, capturing both Lipschitz and smooth functions, for both regression and classification problems. We study the high-probability generalization of this class of losses for linear predictors, in both realizable and non-realizable cases, when the data are sampled i.i.d. or from a Markov chain. The prior work relies on an intricate coupling argument between the iterates of the original problem and those projected onto a bounded domain. This approach enables black-box application of concentration inequalities, but also leads to suboptimal guarantees, due in part to the use of a union bound across all iterations. In this work, we depart significantly from the prior work of Telgarsky (2022) and introduce a novel approach for establishing high-probability generalization guarantees. In contrast to the prior work, we directly analyze the moment generating function of a novel supermartingale sequence and leverage the structure of stochastic mirror descent. As a result, we obtain improved bounds in all the aforementioned settings. Specifically, in the realizable case and in the non-realizable case with light-tailed sub-Gaussian data, we improve the bounds by a $\log T$ factor, matching the correct rates of $1/T$ and $1/\sqrt{T}$, respectively. In the more challenging case of heavy-tailed polynomial data, we improve the existing bound by a $\mathrm{poly}\, T$ factor.
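A standard member of the mirror descent family analyzed here is exponentiated gradient, i.e. mirror descent with the negative-entropy mirror map on the probability simplex. The sketch below uses a noiseless gradient oracle and a toy quadratic objective, both illustrative assumptions rather than anything from Telgarsky (2022) or this work:

```python
import math

def entropic_mirror_descent(grad, x0, steps, eta):
    """Mirror descent with the negative-entropy mirror map
    (exponentiated gradient) on the probability simplex; returns the
    running average of the iterates."""
    x = list(x0)
    avg = [0.0] * len(x)
    for t in range(steps):
        g = grad(x)
        # multiplicative update: x_i <- x_i * exp(-eta * g_i), renormalized
        x = [xi * math.exp(-eta * gi) for xi, gi in zip(x, g)]
        z = sum(x)
        x = [xi / z for xi in x]  # normalization = Bregman projection
        avg = [a + (xi - a) / (t + 1) for a, xi in zip(avg, x)]
    return avg

# toy objective f(x) = 0.5 * ||x - c||^2, minimized over the simplex at c
c = [0.2, 0.3, 0.5]
x_hat = entropic_mirror_descent(lambda x: [xi - ci for xi, ci in zip(x, c)],
                                [1 / 3] * 3, steps=500, eta=0.5)
```

The averaged iterate converges to the minimizer `c`; the entropic mirror map keeps every iterate on the simplex without an explicit Euclidean projection.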


We study stochastic approximation procedures for approximately solving a $d$-dimensional linear fixed-point equation based on observing a trajectory of length $n$ from an ergodic Markov chain. We first exhibit a non-asymptotic bound of the order $t_{\mathrm{mix}} \tfrac{d}{n}$ on the squared error of the last iterate of a standard scheme, where $t_{\mathrm{mix}}$ is a mixing time. We then prove a non-asymptotic instance-dependent bound on a suitably averaged sequence of iterates, with a leading term that matches the local asymptotic minimax limit, including sharp dependence on the parameters $(d, t_{\mathrm{mix}})$ in the higher-order terms. We complement these upper bounds with a non-asymptotic minimax lower bound that establishes the instance-optimality of the averaged SA estimator. We derive corollaries of these results for policy evaluation with Markov noise (covering the TD($\lambda$) family of algorithms for all $\lambda \in [0, 1)$) and for linear autoregressive models. Our instance-dependent characterizations open the door to the design of fine-grained model selection procedures for hyperparameter tuning (e.g., choosing the value of $\lambda$ when running the TD($\lambda$) algorithm).
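As a minimal sketch of the averaged-SA scheme (with i.i.d. observation noise standing in for the Markov samples, and with the matrices, noise level, and step-size exponent chosen for illustration, not taken from the paper):

```python
import numpy as np

def averaged_linear_sa(A, b, n, noise=0.1, seed=0):
    """Linear stochastic approximation for the fixed point of
    A @ theta = b, with Polyak-Ruppert averaging of the iterates.
    I.i.d. observation noise stands in for the Markov samples."""
    rng = np.random.default_rng(seed)
    d = len(b)
    theta = np.zeros(d)
    theta_bar = np.zeros(d)
    for k in range(1, n + 1):
        A_k = A + noise * rng.standard_normal((d, d))
        b_k = b + noise * rng.standard_normal(d)
        alpha = k ** -0.75                     # slowly decaying step size
        theta = theta + alpha * (b_k - A_k @ theta)
        theta_bar += (theta - theta_bar) / k   # running (Polyak) average
    return theta, theta_bar

A = np.array([[1.0, 0.2], [0.0, 1.0]])
b = np.array([1.0, 2.0])
_, theta_bar = averaged_linear_sa(A, b, n=20000)  # solution is [0.6, 2.0]
```

Averaging the iterates is what delivers the optimal leading-order error; the step-size exponent must lie in $(1/2, 1)$ for this to hold.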

We consider a distributed function computation problem in which parties observing noisy versions of a remote source facilitate the computation of a function of their observations at a fusion center through public communication. The distributed function computation is subject to constraints, including not only reliability and storage but also privacy and secrecy. Specifically, 1) the remote source should remain private from an eavesdropper and the fusion center, measured in terms of the information leaked about the remote source; 2) the function computed should remain secret from the eavesdropper, measured in terms of the information leaked about the arguments of the function, to ensure secrecy regardless of the exact function used. We derive the exact rate regions for lossless and lossy single-function computation and illustrate the lossy single-function computation rate region for an information bottleneck example, in which the optimal auxiliary random variables are characterized for binary-input symmetric-output channels. We extend the approach to lossless and lossy asynchronous multiple-function computation with joint secrecy and privacy constraints, in which case inner and outer bounds for the rate regions, differing only in the Markov chain conditions imposed, are characterized.
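For intuition on the binary-input symmetric-output channel example, recall the mutual information across a binary symmetric channel; the following is a generic textbook computation, not the paper's rate-region characterization:

```python
from math import log2

def h2(p):
    """Binary entropy in bits."""
    return -(p * log2(p) + (1 - p) * log2(1 - p))

def bsc_mutual_info(crossover, prior=0.5):
    """I(X;Y) in bits for a binary symmetric channel with the given
    input prior; with a uniform prior this equals the BSC capacity,
    1 - h2(crossover)."""
    py1 = prior * (1 - crossover) + (1 - prior) * crossover
    return h2(py1) - h2(crossover)

info = bsc_mutual_info(0.1)  # about 0.531 bits
```

Quantities of this form appear as corner points of bottleneck-style rate regions, which is why the binary symmetric channel serves as a clean illustrative example.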