

Title: A Two-Moment Inequality with Applications to Rényi Entropy and Mutual Information
This paper explores some applications of a two-moment inequality for the integral of the $r$th power of a function, where $0 < r < 1$.
Award ID(s): 1750362
NSF-PAR ID: 10221391
Journal Name: Entropy
Volume: 22
Issue: 11
ISSN: 1099-4300
Page Range / eLocation ID: 1244
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1.
    For probabilistic programs, it is usually not possible to automatically derive exact information about their properties, such as the distribution of states at a given program point. Instead, one can attempt to derive approximations, such as upper bounds on tail probabilities. Such bounds can be obtained via concentration inequalities, which rely on the moments of a distribution, such as the expectation (the first raw moment) or the variance (the second central moment). Tail bounds obtained using central moments are often tighter than the ones obtained using raw moments, but automatically analyzing central moments is more challenging. This paper presents an analysis for probabilistic programs that automatically derives symbolic upper and lower bounds on variances, as well as higher central moments, of cost accumulators. To overcome the challenges of higher-moment analysis, it generalizes analyses for expectations with an algebraic abstraction that simultaneously analyzes different moments, utilizing relations between them. A key innovation is the notion of moment-polymorphic recursion, and a practical derivation system that handles recursive functions. The analysis has been implemented using a template-based technique that reduces the inference of polynomial bounds to linear programming. Experiments with our prototype central-moment analyzer show that, despite the analyzer’s upper/lower bounds on various quantities, it obtains tighter tail bounds than an existing system that uses only raw moments, such as expectations. 
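As a small illustration of why central moments can yield tighter tail bounds than raw moments, the sketch below compares Markov's inequality (which uses only the expectation) with Cantelli's inequality (which also uses the variance) for a nonnegative cost. The function names and the numbers are hypothetical, chosen for illustration only; this is not the analyzer described in the abstract.

```python
def markov_bound(mean: float, t: float) -> float:
    """Markov: P(X >= t) <= E[X] / t for nonnegative X and t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    return min(1.0, mean / t)


def cantelli_bound(mean: float, variance: float, t: float) -> float:
    """Cantelli: P(X >= t) <= Var / (Var + (t - mean)^2) for t > mean."""
    if t <= mean:
        return 1.0  # the one-sided bound is only informative above the mean
    gap = t - mean
    return variance / (variance + gap * gap)


# Hypothetical cost accumulator: mean 10, variance 4, threshold 20.
mean, variance, t = 10.0, 4.0, 20.0
markov = markov_bound(mean, t)                # uses the first raw moment only
cantelli = cantelli_bound(mean, variance, t)  # uses the second central moment
print(markov, cantelli)
```

With these assumed numbers the variance-based bound is more than ten times smaller than the expectation-based one, which is the kind of gap that motivates analyzing central moments.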
  2. Abstract

    We study the distribution over measurement outcomes of noisy random quantum circuits in the regime of low fidelity, which corresponds to the setting where the computation experiences at least one gate-level error with probability close to one. We model noise by adding a pair of weak, unital, single-qubit noise channels after each two-qubit gate, and we show that for typical random circuit instances, correlations between the noisy output distribution $p_{\text{noisy}}$ and the corresponding noiseless output distribution $p_{\text{ideal}}$ shrink exponentially with the expected number of gate-level errors. Specifically, the linear cross-entropy benchmark $F$ that measures this correlation behaves as $F=\exp(-2s\epsilon \pm O(s\epsilon^2))$, where $\epsilon$ is the probability of error per circuit location and $s$ is the number of two-qubit gates. Furthermore, if the noise is incoherent (for example, depolarizing or dephasing noise), the total variation distance between the noisy output distribution $p_{\text{noisy}}$ and the uniform distribution $p_{\text{unif}}$ decays at precisely the same rate. Consequently, the noisy output distribution can be approximated as $p_{\text{noisy}}\approx Fp_{\text{ideal}}+ (1-F)p_{\text{unif}}$. In other words, although at least one local error occurs with probability $1-F$, the errors are scrambled by the random quantum circuit and can be treated as global white noise, contributing completely uniform output. Importantly, we upper bound the average total variation error in this approximation by $O(F\epsilon \sqrt{s})$. Thus, the “white-noise approximation” is meaningful when $\epsilon \sqrt{s} \ll 1$, a quadratically weaker condition than the $\epsilon s\ll 1$ requirement to maintain high fidelity.
    The bound applies if the circuit size satisfies $s \ge \Omega(n\log(n))$, which corresponds to only logarithmic depth circuits, and if, additionally, the inverse error rate satisfies $\epsilon^{-1} \ge \tilde{\Omega}(n)$, which is needed to ensure errors are scrambled faster than $F$ decays. The white-noise approximation is useful for salvaging the signal from a noisy quantum computation; for example, it was an underlying assumption in complexity-theoretic arguments that noisy random quantum circuits cannot be efficiently sampled classically, even when the fidelity is low. Our method is based on a map from second-moment quantities in random quantum circuits to expectation values of certain stochastic processes for which we compute upper and lower bounds.
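The white-noise approximation stated in this abstract can be sketched numerically. The snippet below is a minimal illustration, assuming a toy four-outcome distribution and dropping the $O(s\epsilon^2)$ correction in the exponent; the function names and all numbers are hypothetical and do not come from the paper.

```python
import math


def xeb_fidelity(s: int, eps: float) -> float:
    """Leading-order estimate F ~ exp(-2 * s * eps); the O(s * eps^2) term is dropped."""
    return math.exp(-2.0 * s * eps)


def white_noise_mixture(p_ideal: list[float], fidelity: float) -> list[float]:
    """Return F * p_ideal + (1 - F) * p_unif over the same outcome set."""
    uniform = 1.0 / len(p_ideal)
    return [fidelity * p + (1.0 - fidelity) * uniform for p in p_ideal]


# Hypothetical toy numbers: 4 outcomes, 100 two-qubit gates, error rate 0.005.
p_ideal = [0.5, 0.3, 0.15, 0.05]
F = xeb_fidelity(s=100, eps=0.005)  # exponent is -2 * 100 * 0.005 = -1
p_noisy = white_noise_mixture(p_ideal, F)
print(F, p_noisy, sum(p_noisy))
```

Because both components are probability distributions and the weights $F$ and $1-F$ sum to one, the mixture is itself a probability distribution.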

     
  3. We introduce a notion called entropic independence that is an entropic analog of spectral notions of high-dimensional expansion. Informally, entropic independence of a background distribution $\mu$ on $k$-sized subsets of a ground set of elements says that for any (possibly randomly chosen) set $S$, the relative entropy of a single element of $S$ drawn uniformly at random carries at most $O(1/k)$ fraction of the relative entropy of $S$. Entropic independence is the analog of the notion of spectral independence, if one replaces variance by entropy. We use entropic independence to derive tight mixing time bounds, overcoming the lossy nature of spectral analysis of Markov chains on exponential-sized state spaces. In our main technical result, we show a general way of deriving entropy contraction, a.k.a. modified log-Sobolev inequalities, for down-up random walks from spectral notions. We show that spectral independence of a distribution under arbitrary external fields automatically implies entropic independence. We furthermore extend our theory to the case where spectral independence does not hold under arbitrary external fields. To do this, we introduce a framework for obtaining tight mixing time bounds for Markov chains based on what we call restricted modified log-Sobolev inequalities, which guarantee entropy contraction not for all distributions, but for those in a sufficiently large neighborhood of the stationary distribution. To derive our results, we relate entropic independence to properties of polynomials: $\mu$ is entropically independent exactly when a transformed version of the generating polynomial of $\mu$ is upper bounded by its linear tangent; this property is implied by concavity of the said transformation, which was shown by prior work to be locally equivalent to spectral independence. 
We apply our results to obtain (1) tight modified log-Sobolev inequalities and mixing times for multi-step down-up walks on fractionally log-concave distributions, (2) the tight mixing time of $O(n\log n)$ for Glauber dynamics on Ising models whose interaction matrix has eigenspectrum lying within an interval of length smaller than $1$, improving upon the prior quadratic dependence on $n$, and (3) nearly-linear time $\widetilde O_{\delta}(n)$ samplers for the hardcore and Ising models on $n$-node graphs that have $\delta$-relative gap to the tree-uniqueness threshold. In the last application, our bound on the running time does not depend on the maximum degree $\Delta$ of the graph, and is therefore optimal even for high-degree graphs, and in fact, is sublinear in the size of the graph for high-degree graphs. 
  4. Summary

    The upper bounds on the coverage probabilities of the confidence regions based on blockwise empirical likelihood and non-standard expansive empirical likelihood methods for time series data are investigated via studying the probability of violating the convex hull constraint. The large sample bounds are derived on the basis of the pivotal limit of the blockwise empirical log-likelihood ratio obtained under fixed-$b$ asymptotics, which has recently been shown to provide a more accurate approximation to the finite sample distribution than the conventional $\chi^2$-approximation. Our theoretical and numerical findings suggest that both the finite sample and the large sample upper bounds for coverage probabilities are strictly less than 1 and the blockwise empirical likelihood confidence region can exhibit serious undercoverage when the dimension of moment conditions is moderate or large, the time series dependence is positively strong or the block size is large relative to the sample size. A similar finite sample coverage problem occurs for non-standard expansive empirical likelihood. To alleviate the coverage bound problem, we propose to penalize both empirical likelihood methods by relaxing the convex hull constraint. Numerical simulations and data illustrations demonstrate the effectiveness of our proposed remedies in terms of delivering confidence sets with more accurate coverage. Some technical details and additional simulation results are included in on-line supplemental material.

     
  5. Abstract

    We study the performance of Markov chains for the $q$-state ferromagnetic Potts model on random regular graphs. While the cases of the grid and the complete graph are by now well-understood, the case of random regular graphs has resisted a detailed analysis and, in fact, even analysing the properties of the Potts distribution has remained elusive. It is conjectured that the performance of Markov chains is dictated by metastability phenomena, i.e., the presence of “phases” (clusters) in the sample space where Markov chains with local update rules, such as the Glauber dynamics, are bound to take exponential time to escape, and therefore cause slow mixing. The phases that are believed to drive these metastability phenomena in the case of the Potts model emerge as local, rather than global, maxima of the so-called Bethe functional, and previous approaches of analysing these phases based on optimisation arguments fall short of the task. Our first contribution is to detail the emergence of the two relevant phases for the $q$-state Potts model on the $d$-regular random graph for all integers $q,d\ge 3$, and establish that for an interval of temperatures, delineated by the uniqueness and a broadcasting threshold on the $d$-regular tree, the two phases coexist (as possible metastable states). The proofs are based on a conceptual connection between spatial properties and the structure of the Potts distribution on the random regular graph, rather than complicated moment calculations. This significantly refines earlier results by Helmuth, Jenssen, and Perkins who had established phase coexistence for a small interval around the so-called ordered-disordered threshold (via different arguments) that applied for large $q$ and $d\ge 5$. Based on our new structural understanding of the model, our second contribution is to obtain metastability results for two classical Markov chains for the Potts model.
    We first complement recent fast mixing results for Glauber dynamics by Blanca and Gheissari below the uniqueness threshold, by showing an exponential lower bound on the mixing time above the uniqueness threshold. Then, we obtain tight results even for the non-local and more elaborate Swendsen–Wang chain, where we establish slow mixing/metastability for the whole interval of temperatures where the chain is conjectured to mix slowly on the random regular graph. The key is to bound the conductance of the chains using a random graph “planting” argument combined with delicate bounds on random-graph percolation.
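For readers unfamiliar with the local update rule mentioned in this abstract, the following is a minimal sketch of one heat-bath Glauber step for the $q$-state ferromagnetic Potts model. It assumes the common convention that a configuration has weight $\exp(\beta \cdot \text{number of monochromatic edges})$; the function name, the toy graph, and the parameter values are hypothetical and are not taken from the paper.

```python
import math
import random


def glauber_step(colors, neighbors, q, beta, rng):
    """One heat-bath Glauber update of the q-state ferromagnetic Potts model.

    Picks a uniformly random vertex and resamples its color with probability
    proportional to exp(beta * number of neighbors sharing that color).
    """
    v = rng.randrange(len(colors))
    weights = []
    for c in range(q):
        agree = sum(1 for u in neighbors[v] if colors[u] == c)
        weights.append(math.exp(beta * agree))
    colors[v] = rng.choices(range(q), weights=weights, k=1)[0]
    return v


# Hypothetical toy instance: a 4-cycle (2-regular), q = 3 colors, beta = 1.0.
neighbors = [[1, 3], [0, 2], [1, 3], [2, 0]]
rng = random.Random(0)
colors = [rng.randrange(3) for _ in neighbors]
for _ in range(1000):
    glauber_step(colors, neighbors, q=3, beta=1.0, rng=rng)
print(colors)
```

Each step changes at most one vertex, which is what makes the chain "local"; the metastability results in the abstract concern how long such single-site updates take to move between phases.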

     