skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Unified lower bounds for interactive high-dimensional estimation under information constraints
We consider distributed parameter estimation using interactive protocols subject to local information constraints such as bandwidth limitations, local differential privacy, and restricted measurements. We provide a unified framework enabling us to derive a variety of (tight) minimax lower bounds for different parametric families of distributions, both continuous and discrete, under any Lp loss. Our lower bound framework is versatile and yields “plug-and-play” bounds that are widely applicable to a large range of estimation problems, and, for the prototypical case of the Gaussian family, circumvents limitations of previous techniques. In particular, our approach recovers bounds obtained using data processing inequalities and Cramér–Rao bounds, two other alternative approaches for proving lower bounds in our setting of interest. Further, for the families considered, we complement our lower bounds with matching upper bounds.  more » « less
Award ID(s):
1846300
PAR ID:
10541067
Author(s) / Creator(s):
; ; ;
Editor(s):
Oh, A; Naumann, T; Globerson, A; Saenko, K; Hardt, M; Levine, S
Publisher / Repository:
Curran Associates, Inc.
Date Published:
Format(s):
Medium: X
Location:
New Orleans, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. We study robust testing and estimation of discrete distributions in the strong contamination model. Our results cover both centralized setting and distributed setting with general local information constraints including communication and LDP constraints. Our technique relates the strength of manipulation attacks to the earth-mover distance using Hamming distance as the metric between messages (samples) from the users. In the centralized setting, we provide optimal error bounds for both learning and testing. Our lower bounds under local information constraints build on the recent lower bound methods in distributed inference. In the communication constrained setting, we develop novel algorithms based on random hashing and an L1-L1 isometry. 
    more » « less
  2. Estimation of Shannon and Rényi entropies of unknown discrete distributions is a fundamental problem in statistical property testing. In this paper, we give the first quantum algorithms for estimating α-Rényi entropies (Shannon entropy being 1-Rényi entropy). In particular, we demonstrate a quadratic quantum speedup for Shannon entropy estimation and a generic quantum speedup for α-Rényi entropy estimation for all α ≥ 0, including tight bounds for the Shannon entropy, the Hartley entropy (α = 0), and the collision entropy (α = 2). We also provide quantum upper bounds for estimating min-entropy (α = +∞) as well as the Kullback-Leibler divergence. We complement our results with quantum lower bounds on α- Rényi entropy estimation for all α ≥ 0. Our approach is inspired by the pioneering work of Bravyi, Harrow, and Hassidim (BHH) [1], however, with many new technical ingredients: (1) we improve the error dependence of the BHH framework by a fine-tuned error analysis together with Montanaro’s approach to estimating the expected output of quantum subroutines [2] for α = 0, 1; (2) we develop a procedure, similar to cooling schedules in simulated annealing, for general α ≥ 0; (3) in the cases of integer α ≥ 2 and α = +∞, we reduce the entropy estimation problem to the α-distinctness and the ⌈log n⌉-distinctness problems, respectively. 
    more » « less
  3. We consider the problem of estimating sparse discrete distributions under local differential privacy (LDP) and communication constraints. We characterize the sample complexity for sparse estimation under LDP constraints up to a constant factor, and the sample complexity under communication constraints up to a logarithmic factor. Our upper bounds under LDP are based on the Hadamard Response, a private coin scheme that requires only one bit of communication per user. Under communication constraints we propose public coin schemes based on random hashing functions. Our tight lower bounds are based on recently proposed method of chi squared contractions. 
    more » « less
  4. The level set estimation problem seeks to find all points in a domain  where the value of an unknown function 𝑓:→ℝ exceeds a threshold 𝛼 . The estimation is based on noisy function evaluations that may be acquired at sequentially and adaptively chosen locations in  . The threshold value 𝛼 can either be explicit and provided a priori, or implicit and defined relative to the optimal function value, i.e. 𝛼=(1−𝜖)𝑓(𝐱∗) for a given 𝜖>0 where 𝑓(𝐱∗) is the maximal function value and is unknown. In this work we provide a new approach to the level set estimation problem by relating it to recent adaptive experimental design methods for linear bandits in the Reproducing Kernel Hilbert Space (RKHS) setting. We assume that 𝑓 can be approximated by a function in the RKHS up to an unknown misspecification and provide novel algorithms for both the implicit and explicit cases in this setting with strong theoretical guarantees. Moreover, in the linear (kernel) setting, we show that our bounds are nearly optimal, namely, our upper bounds match existing lower bounds for threshold linear bandits. To our knowledge this work provides the first instance-dependent, non-asymptotic upper bounds on sample complexity of level-set estimation that match information theoretic lower bounds. 
    more » « less
  5. We consider a distributed empirical risk minimization (ERM) optimization problem with communication efficiency and privacy requirements, motivated by the federated learn- ing (FL) framework. We propose a distributed communication-efficient and local differentially private stochastic gradient descent (CLDP-SGD) algorithm and analyze its communication, privacy, and convergence trade-offs. Since each iteration of the CLDP- SGD aggregates the client-side local gradients, we develop (optimal) communication-efficient schemes for mean estimation for several lp spaces under local differential privacy (LDP). To overcome performance limitation of LDP, CLDP-SGD takes advantage of the inherent privacy amplification provided by client sub- sampling and data subsampling at each se- lected client (through SGD) as well as the recently developed shuffled model of privacy. For convex loss functions, we prove that the proposed CLDP-SGD algorithm matches the known lower bounds on the centralized private ERM while using a finite number of bits per iteration for each client, i.e., effectively get- ting communication efficiency for “free”. We also provide preliminary experimental results supporting the theory. 
    more » « less