Title: Sparse confidence sets for normal mean models
Abstract: In this paper, we propose a new framework to construct confidence sets for a $$d$$-dimensional unknown sparse parameter $${\boldsymbol \theta}$$ under the normal mean model $${\boldsymbol X}\sim N({\boldsymbol \theta},\sigma^{2}\mathbf{I})$$. A key feature of the proposed confidence set is its capability to account for the sparsity of $${\boldsymbol \theta}$$; hence we call it a sparse confidence set. This is in sharp contrast with classical methods, such as Bonferroni confidence intervals and resampling-based procedures, where the sparsity of $${\boldsymbol \theta}$$ is often ignored. Specifically, we require the desired sparse confidence set to satisfy the following two conditions: (i) uniformly over the parameter space, the coverage probability for $${\boldsymbol \theta}$$ is above a pre-specified level; (ii) there exists a random subset $$S$$ of $$\{1,\dots,d\}$$ such that $$S$$ guarantees the pre-specified true negative rate for detecting non-zero $$\theta_{j}$$'s. To exploit the sparsity of $${\boldsymbol \theta}$$, we allow the confidence interval for $$\theta_{j}$$ to degenerate to the single point $$0$$ for any $$j\notin S$$. Under this new framework, we first ask whether sparse confidence sets satisfying the above two conditions exist at all. To address this question, we establish a non-asymptotic minimax lower bound for the non-coverage probability over a suitable class of sparse confidence sets. The lower bound deciphers the role of the sparsity and of the minimum signal-to-noise ratio (SNR) in the construction of sparse confidence sets. Furthermore, under suitable conditions on the SNR, we propose a two-stage procedure to construct a sparse confidence set. To evaluate its optimality, the proposed sparse confidence set is shown to attain a minimax lower bound of a properly defined risk function up to a constant factor. Finally, we develop a procedure that adapts to the unknown sparsity. Numerical studies are conducted to verify the theoretical results.
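To make the two-stage idea above concrete, here is a minimal, illustrative Python sketch: select a set S of coordinates by thresholding, then report intervals on S and the degenerate interval {0} elsewhere. This is not the authors' actual procedure; the threshold, the interval half-width, and the treatment of the selection step are placeholder choices, and the paper's calibration (which depends on the sparsity level and the minimum SNR) is more delicate.

```python
import numpy as np
from scipy.stats import norm

def sparse_confidence_set(x, sigma, alpha=0.05, tau=None):
    """Illustrative two-stage sparse confidence set (not the paper's exact method).

    Stage 1: select S = {j : |x_j| > tau}, intended to capture non-zero theta_j's
             (true negative rate control would rely on a large enough minimum SNR).
    Stage 2: Bonferroni-style intervals on S; the single point {0} outside S.
    """
    d = len(x)
    if tau is None:
        # Placeholder universal threshold sigma * sqrt(2 log d); the paper's choice may differ.
        tau = sigma * np.sqrt(2.0 * np.log(d))
    S = np.flatnonzero(np.abs(x) > tau)

    # Simple Bonferroni half-width over the selected coordinates only;
    # this ignores the selection effect and is for illustration only.
    half = sigma * norm.ppf(1.0 - alpha / (2.0 * len(S))) if len(S) > 0 else 0.0

    lower = np.zeros(d)   # coordinates outside S degenerate to the single point 0
    upper = np.zeros(d)
    lower[S] = x[S] - half
    upper[S] = x[S] + half
    return S, lower, upper

# Toy usage: a 3-sparse signal in d = 200 dimensions.
rng = np.random.default_rng(0)
d, sigma = 200, 1.0
theta = np.zeros(d)
theta[:3] = 6.0
x = theta + sigma * rng.standard_normal(d)
S, lo, hi = sparse_confidence_set(x, sigma)
print("selected coordinates:", S)
```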
Award ID(s): 2134209
PAR ID: 10402423
Author(s) / Creator(s):
Publisher / Repository: Oxford University Press
Date Published:
Journal Name: Information and Inference: A Journal of the IMA
Volume: 12
Issue: 3
ISSN: 2049-8772
Page Range / eLocation ID: p. 1193-1247
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Dalalyan, Arnak (Ed.)
    Topic models have become popular tools for dimension reduction and exploratory analysis of text data, which consist of observed frequencies of a vocabulary of p words in n documents, stored in a p×n matrix. The main premise is that the mean of this data matrix can be factorized into the product of two non-negative matrices: a p×K word-topic matrix A and a K×n topic-document matrix W. This paper studies the estimation of a possibly element-wise sparse A when the number of topics K is unknown. In this under-explored setting, we derive a new minimax lower bound for the estimation of such A and propose a new computationally efficient algorithm for its recovery. We derive a finite-sample upper bound for our estimator and show that it matches the minimax lower bound in many scenarios. Our estimator adapts to the unknown sparsity of A, and our analysis is valid for any finite n, p, K and document lengths. Empirical results on both synthetic and semi-synthetic data show that our proposed estimator is a strong competitor of existing state-of-the-art algorithms for both non-sparse and sparse A, and has superior performance in many scenarios of interest.
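As a concrete illustration of the setup just described (a p×n frequency matrix whose mean factors as A·W), the Python sketch below generates synthetic documents and recovers a word-topic matrix with a generic non-negative factorization. The scikit-learn `NMF` call is only a baseline stand-in; it is not the specialized, sparsity-adaptive estimator studied in the paper.

```python
import numpy as np
from sklearn.decomposition import NMF

# Synthetic corpus: p words, n documents, K topics,
# word-topic matrix A (p x K) and topic-document matrix W (K x n).
rng = np.random.default_rng(1)
p, n, K, doc_len = 300, 100, 5, 500

A = rng.dirichlet(np.full(p, 0.05), size=K).T   # columns sum to 1, mostly near-sparse
W = rng.dirichlet(np.full(K, 0.3), size=n).T    # columns sum to 1
M = A @ W                                       # p x n matrix of expected word frequencies

# Observed word frequencies: multinomial counts per document, normalized by length.
X = np.column_stack([rng.multinomial(doc_len, M[:, j]) / doc_len for j in range(n)])

# Generic non-negative factorization as a baseline recovery of A.
nmf = NMF(n_components=K, init="nndsvda", max_iter=500, random_state=0)
A_hat = nmf.fit_transform(X)                    # p x K
A_hat /= A_hat.sum(axis=0, keepdims=True)       # renormalize columns for comparison with A
print("A_hat shape:", A_hat.shape)
```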
  2. Abstract: Let $$\mathcal{N}(b)$$ be the set of real numbers that are normal to base b. A well-known result of Ki and Linton [19] is that $$\mathcal{N}(b)$$ is $$\boldsymbol{\Pi}^0_3$$-complete. We show that the set $${\mathcal{N}}^\perp(b)$$ of reals which preserve $$\mathcal{N}(b)$$ under addition is also $$\boldsymbol{\Pi}^0_3$$-complete. We use the characterization of $${\mathcal{N}}^\perp(b)$$, given by Rauzy, in terms of an entropy-like quantity called the noise. It follows from our results that no further characterization theorems could result in a still better bound on the complexity of $${\mathcal{N}}^\perp(b)$$. We compute the exact descriptive complexity of other naturally occurring sets associated with the noise. One of these is complete at the $$\boldsymbol{\Pi}^0_4$$ level. Finally, we obtain upper and lower bounds on the Hausdorff dimension of the level sets associated with the noise.
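For readers unfamiliar with the notation, the set $$\mathcal{N}(b)$$ above refers to the classical notion of Borel normality, recalled in the LaTeX sketch below; the entropy-like noise quantity of Rauzy mentioned in the abstract is a separate notion and is not reproduced here.

```latex
% Classical definition underlying \mathcal{N}(b): x is normal to base b when every
% finite digit block occurs with its expected asymptotic frequency.
Let $x$ have base-$b$ expansion $x = \lfloor x \rfloor + \sum_{i \ge 1} a_i b^{-i}$.
Then $x \in \mathcal{N}(b)$ if, for every $k \ge 1$ and every block
$w \in \{0, 1, \dots, b-1\}^{k}$,
\[
  \lim_{N \to \infty}
  \frac{\#\{\, 1 \le i \le N \;:\; a_i a_{i+1} \cdots a_{i+k-1} = w \,\}}{N}
  \;=\; b^{-k}.
\]
```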
  3. High-dimensional linear models with endogenous variables play an increasingly important role in the recent econometric literature. In this work, we allow for models with many endogenous variables and make use of many instrumental variables to achieve identification. Because of the high-dimensionality in the structural equation, constructing honest confidence regions with asymptotically correct coverage is non-trivial. Our main contribution is to propose estimators and confidence regions that achieve this goal. Our approach relies on moment conditions that satisfy the usual instrument orthogonality condition but also have an additional orthogonality property with respect to specific linear combinations of the endogenous variables which are treated as nuisance parameters. We propose new pivotal procedures for estimating the high-dimensional nuisance parameters which appear in our formulation. We use a multiplier bootstrap procedure to compute critical values and establish its validity for achieving simultaneously valid confidence regions for a potentially high-dimensional set of endogenous variable coefficients. 
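As a rough illustration of the final step above (the multiplier bootstrap used to compute simultaneous critical values), here is a generic Python sketch. The score matrix `psi` is a hypothetical stand-in for the orthogonalized moment contributions described in the abstract, and the plain Gaussian-multiplier max statistic shown here omits the paper's specific estimation and orthogonalization machinery.

```python
import numpy as np

def multiplier_bootstrap_cv(psi, alpha=0.05, B=1000, rng=None):
    """Gaussian multiplier bootstrap for a sup-norm critical value (illustrative only).

    psi : (n, p) array of estimated, centered score contributions, one row per
          observation. Returns the (1 - alpha) quantile of
          max_j | n^{-1/2} * sum_i e_i * psi_ij | over B draws of e_i ~ N(0, 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, p = psi.shape
    stats = np.empty(B)
    for b in range(B):
        e = rng.standard_normal(n)                     # i.i.d. standard normal multipliers
        stats[b] = np.max(np.abs(e @ psi) / np.sqrt(n))
    return np.quantile(stats, 1.0 - alpha)

# Toy usage: the critical value then yields simultaneous intervals of the form
# beta_hat_j +/- cv * se_j across all coefficients of interest.
rng = np.random.default_rng(2)
psi = rng.standard_normal((500, 50))
print("simultaneous critical value:", multiplier_bootstrap_cv(psi, rng=rng))
```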
  4.
    Driven by a wide range of applications, several principal subspace estimation problems have been studied individually under different structural constraints. This paper presents a unified framework for the statistical analysis of a general structured principal subspace estimation problem which includes as special cases sparse PCA/SVD, non-negative PCA/SVD, subspace constrained PCA/SVD, and spectral clustering. General minimax lower and upper bounds are established to characterize the interplay between the information-geometric complexity of the constraint set for the principal subspaces, the signal-to-noise ratio (SNR), and the dimensionality. The results yield interesting phase transition phenomena concerning the rates of convergence as a function of the SNRs and the fundamental limit for consistent estimation. Applying the general results to the specific settings yields the minimax rates of convergence for those problems, including the previously unknown optimal rates for sparse SVD, non-negative PCA/SVD and subspace constrained PCA/SVD.
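To ground one of the special cases named above (non-negative PCA), the sketch below runs a generic projected power iteration on a sample covariance matrix: multiply by the covariance, project onto the constraint set (here the non-negative orthant), and renormalize. This is a standard heuristic used only to illustrate the constrained-subspace setting; it is not one of the estimators whose minimax rates are analyzed in the paper.

```python
import numpy as np

def nonnegative_pca(S, iters=200):
    """Projected power iteration for a leading non-negative principal direction (illustrative)."""
    d = S.shape[0]
    v = np.full(d, 1.0 / np.sqrt(d))        # feasible non-negative starting vector
    for _ in range(iters):
        v = S @ v
        v = np.clip(v, 0.0, None)           # project onto the non-negative orthant
        nrm = np.linalg.norm(v)
        if nrm == 0.0:
            break
        v /= nrm
    return v

# Toy usage: spiked covariance with a non-negative leading eigenvector.
rng = np.random.default_rng(3)
d = 50
u = np.abs(rng.standard_normal(d)); u /= np.linalg.norm(u)
L = np.linalg.cholesky(np.eye(d) + 4.0 * np.outer(u, u))
X = rng.standard_normal((400, d)) @ L.T
S = X.T @ X / X.shape[0]
print("alignment with truth:", abs(nonnegative_pca(S) @ u))
```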
  5. Oja's algorithm for Streaming Principal Component Analysis (PCA) for n data points in a d-dimensional space achieves the same sin-squared error O(r_eff/n) as the offline algorithm in O(d) space and O(nd) time with a single pass through the data points. Here r_eff is the effective rank (the ratio of the trace to the principal eigenvalue of the population covariance matrix Σ). Under this computational budget, we consider the problem of sparse PCA, where the principal eigenvector of Σ is s-sparse and r_eff can be large. In this setting, to our knowledge, there are no known single-pass algorithms that achieve the minimax error bound in O(d) space and O(nd) time without either requiring strong initialization conditions or assuming further structure (e.g., spiked) of the covariance matrix. We show that a simple single-pass procedure that thresholds the output of Oja's algorithm (the Oja vector) can achieve the minimax error bound under some regularity conditions in O(d) space and O(nd) time. We present a nontrivial and novel analysis of the entries of the unnormalized Oja vector, which involves the projection of a product of independent random matrices onto a random initial vector. This is completely different from previous analyses of Oja's algorithm and matrix products, which have been done when r_eff is bounded.
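The thresholded-Oja idea described above lends itself to a very short sketch. The Python below is only illustrative: the ad hoc learning-rate schedule and the keep-the-top-s hard threshold are stand-ins for the paper's calibrated choices and regularity conditions.

```python
import numpy as np

def oja_threshold(stream, d, eta=0.01, s=None):
    """Single-pass Oja iteration followed by hard thresholding of the Oja vector (illustrative)."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    for t, x in enumerate(stream, start=1):
        w = w + (eta / np.sqrt(t)) * x * (x @ w)   # O(d) Oja update per data point
        w /= np.linalg.norm(w)
    if s is not None:
        keep = np.argsort(np.abs(w))[-s:]          # keep the s largest entries of the Oja vector
        w_sparse = np.zeros(d)
        w_sparse[keep] = w[keep]
        w = w_sparse / np.linalg.norm(w_sparse)
    return w

# Toy usage: s-sparse leading eigenvector in a spiked covariance model.
rng = np.random.default_rng(4)
d, n, s = 200, 5000, 10
v = np.zeros(d); v[:s] = 1.0 / np.sqrt(s)
data = rng.standard_normal((n, d)) + 3.0 * rng.standard_normal((n, 1)) * v
w = oja_threshold(iter(data), d, s=s)
print("alignment:", abs(w @ v))
```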