In this paper, we propose a new framework to construct confidence sets for a $d$-dimensional unknown sparse parameter ${\boldsymbol \theta }$ under the normal mean model ${\boldsymbol X}\sim N({\boldsymbol \theta },\sigma ^{2}\mathbf{I})$. A key feature of the proposed confidence set is its capability to account for the sparsity of ${\boldsymbol \theta }$; it is hence named a sparse confidence set. This is in sharp contrast with classical methods, such as Bonferroni confidence intervals and other resampling-based procedures, in which the sparsity of ${\boldsymbol \theta }$ is often ignored. Specifically, we require the desired sparse confidence set to satisfy the following two conditions: (i) uniformly over the parameter space, the coverage probability for ${\boldsymbol \theta }$ is above a pre-specified level; (ii) there exists a random subset $S$ of $\{1,\dots,d\}$ such that $S$ guarantees the pre-specified true negative rate for detecting non-zero $\theta _{j}$'s. To exploit the sparsity of ${\boldsymbol \theta }$, we allow the confidence interval for $\theta _{j}$ to degenerate to the single point $0$ for any $j\notin S$. Under this new framework, we first consider whether there exist sparse confidence sets satisfying the above two conditions. To address this question, we establish a non-asymptotic minimax lower bound for the non-coverage probability over a suitable class of sparse confidence sets. The lower bound deciphers the role of sparsity and the minimum signal-to-noise ratio (SNR) in the construction of sparse confidence sets. Furthermore, under suitable conditions on the SNR, a two-stage procedure is proposed to construct a sparse confidence set. To evaluate its optimality, the proposed sparse confidence set is shown to attain a minimax lower bound of a properly defined risk function up to a constant factor. Finally, we develop a procedure adaptive to the unknown sparsity. Numerical studies are conducted to verify the theoretical results.
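The two-stage idea can be illustrated with a minimal sketch: this is not the paper's exact construction, only an assumed stylized version in which stage 1 selects a candidate support $S$ by thresholding and stage 2 places Bonferroni-corrected intervals on $S$ while every other interval degenerates to $\{0\}$. The universal threshold $\sigma\sqrt{2\log d}$ and the interval widths are illustrative choices.

```python
import numpy as np
from scipy.stats import norm

def sparse_confidence_set(x, sigma=1.0, alpha=0.05):
    """Illustrative two-stage sparse confidence set (not the paper's procedure)."""
    d = len(x)
    tau = sigma * np.sqrt(2 * np.log(d))            # stage 1: selection threshold (assumed choice)
    S = np.flatnonzero(np.abs(x) > tau)             # candidate support
    z = norm.ppf(1 - alpha / (2 * max(len(S), 1)))  # stage 2: Bonferroni over |S| coordinates
    lower, upper = np.zeros(d), np.zeros(d)         # intervals degenerate to {0} off S
    lower[S] = x[S] - z * sigma
    upper[S] = x[S] + z * sigma
    return S, lower, upper

rng = np.random.default_rng(0)
theta = np.zeros(200)
theta[:5] = 6.0                                     # sparse signal with strong SNR
x = theta + rng.standard_normal(200)                # X ~ N(theta, I)
S, lo, hi = sparse_confidence_set(x)
```

Off the selected set the intervals carry zero width, which is exactly how this kind of set exploits sparsity; the price is that coverage hinges on the selection step, which is why the SNR conditions in the abstract matter.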
 Award ID(s):
 2134209
 NSF-PAR ID:
 10402423
 Publisher / Repository:
 Oxford University Press
 Date Published:
 Journal Name:
 Information and Inference: A Journal of the IMA
 Volume:
 12
 Issue:
 3
 ISSN:
 2049-8772
 Page Range / eLocation ID:
 p. 1193–1247
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Dalalyan, Arnak (Ed.) Topic models have become popular tools for dimension reduction and exploratory analysis of text data, which consist of observed frequencies of a vocabulary of p words in n documents, stored in a p×n matrix. The main premise is that the mean of this data matrix can be factorized into a product of two nonnegative matrices: a p×K word-topic matrix A and a K×n topic-document matrix W. This paper studies the estimation of A when A is possibly element-wise sparse and the number of topics K is unknown. In this under-explored context, we derive a new minimax lower bound for the estimation of such A and propose a new computationally efficient algorithm for its recovery. We derive a finite-sample upper bound for our estimator and show that it matches the minimax lower bound in many scenarios. Our estimate adapts to the unknown sparsity of A, and our analysis is valid for any finite n, p, K and document lengths. Empirical results on both synthetic and semi-synthetic data show that our proposed estimator is a strong competitor of existing state-of-the-art algorithms for both non-sparse A and sparse A, and has superior performance in many scenarios of interest.

Low-rank matrix recovery is a fundamental problem in machine learning with numerous applications. In practice, the problem can be solved by convex optimization, namely nuclear norm minimization, or by nonconvex optimization, as it is well known that for low-rank matrix problems like matrix sensing and matrix completion, all local optima of the natural nonconvex objectives are also globally optimal under certain ideal assumptions. In this paper, we study new approaches for matrix sensing in a semi-random model where an adversary can add any number of arbitrary sensing matrices. More precisely, the problem is to recover a low-rank matrix $X^\star$ from linear measurements $b_i = \langle A_i, X^\star \rangle$, where an unknown subset of the sensing matrices satisfies the Restricted Isometry Property (RIP) and the rest of the $A_i$'s are chosen adversarially. It is known that in the semi-random model, existing nonconvex objectives can have bad local optima. To fix this, we present a descent-style algorithm that provably recovers the ground-truth matrix $X^\star$. For the closely related problem of semi-random matrix completion, prior work [CG18] showed that all bad local optima can be eliminated by reweighting the input data. However, the analogous approach for matrix sensing requires reweighting a set of matrices to satisfy RIP, a condition that is NP-hard to check. Instead, we build on the framework proposed in [KLL$^+$23] for semi-random sparse linear regression, where the algorithm in each iteration reweights the input based on the current solution and then takes a weighted gradient step that is guaranteed to work well locally. Our analysis crucially exploits the connection between sparsity in vector problems and low-rankness in matrix problems, which may have other applications in obtaining robust algorithms for sparse and low-rank problems.
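A minimal sketch of the measurement model $b_i = \langle A_i, X^\star\rangle$ and of plain factored gradient descent on the natural nonconvex objective may help fix ideas. This is the baseline the abstract says can fail in the semi-random model, not the paper's reweighting algorithm; here all $A_i$ are Gaussian (standing in for the RIP part, with no adversarial matrices), and the dimensions, step size, and warm start are toy choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, m = 10, 2, 200

U_star = rng.standard_normal((n, r))
X_star = U_star @ U_star.T                     # rank-r ground truth
A = rng.standard_normal((m, n, n))
A = (A + A.transpose(0, 2, 1)) / 2             # symmetrized sensing matrices
b = np.einsum('mij,ij->m', A, X_star)          # b_i = <A_i, X*>

def loss_and_grad(U):
    """f(U) = (1/2m) sum_i (<A_i, U U^T> - b_i)^2 and its gradient in U."""
    resid = np.einsum('mij,ij->m', A, U @ U.T) - b
    loss = 0.5 * np.mean(resid ** 2)
    grad = 2.0 * (np.einsum('m,mij->ij', resid, A) @ U) / m
    return loss, grad

U = U_star + 0.1 * rng.standard_normal((n, r))  # warm start near the truth
eta = 0.002                                     # small, conservative step size
losses = []
for _ in range(100):
    loss, g = loss_and_grad(U)
    losses.append(loss)
    U -= eta * g
```

With only benign Gaussian measurements the loss decreases steadily; the point of the semi-random model is that adversarially chosen extra $A_i$'s can plant bad local optima for exactly this objective, motivating the reweighted gradient steps described above.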

Summary The upper bounds on the coverage probabilities of confidence regions based on blockwise empirical likelihood and non-standard expansive empirical likelihood methods for time series data are investigated by studying the probability of violating the convex hull constraint. The large-sample bounds are derived on the basis of the pivotal limit of the blockwise empirical log-likelihood ratio obtained under fixed-b asymptotics, which has recently been shown to provide a more accurate approximation to the finite-sample distribution than the conventional $\chi^{2}$-approximation. Our theoretical and numerical findings suggest that both the finite-sample and the large-sample upper bounds on coverage probabilities are strictly less than one, and that the blockwise empirical likelihood confidence region can exhibit serious undercoverage when the dimension of the moment conditions is moderate or large, the time series dependence is positively strong, or the block size is large relative to the sample size. A similar finite-sample coverage problem occurs for non-standard expansive empirical likelihood. To alleviate the coverage bound problem, we propose to penalize both empirical likelihood methods by relaxing the convex hull constraint. Numerical simulations and data illustrations demonstrate the effectiveness of the proposed remedies in delivering confidence sets with more accurate coverage. Some technical details and additional simulation results are included in online supplemental material.
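The convex hull constraint driving these bounds can be simulated in a simple scalar setting: the empirical likelihood ratio at a candidate mean $\mu$ is finite only when $\mu$ lies in the convex hull of the block means, and for a scalar series that hull is just the interval $[\min, \max]$ of the block means. The AR(1) model, block size, and sample size below are arbitrary illustrative choices.

```python
import numpy as np

def block_means(x, block_size):
    """Means of consecutive non-overlapping blocks (trailing remainder dropped)."""
    n_blocks = len(x) // block_size
    return x[:n_blocks * block_size].reshape(n_blocks, block_size).mean(axis=1)

def hull_violated(x, mu, block_size):
    """True if mu falls outside the convex hull of the block means."""
    bm = block_means(x, block_size)
    return not (bm.min() <= mu <= bm.max())

rng = np.random.default_rng(3)
n, rho, reps, block_size = 200, 0.8, 2000, 20
violations = 0
for _ in range(reps):
    e = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = e[0]
    for t in range(1, n):                  # AR(1) with strong positive dependence
        x[t] = rho * x[t - 1] + e[t]
    violations += hull_violated(x, 0.0, block_size)  # true mean is 0

rate = violations / reps                   # fraction of samples where the true
                                           # mean escapes the hull of 10 block means
```

A strictly positive violation rate is exactly what caps the coverage probability below one: whenever the true mean lies outside the hull, no calibration of the likelihood-ratio cutoff can make the region cover it.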

Driven by a wide range of applications, several principal subspace estimation problems have been studied individually under different structural constraints. This paper presents a unified framework for the statistical analysis of a general structured principal subspace estimation problem, which includes as special cases sparse PCA/SVD, nonnegative PCA/SVD, subspace-constrained PCA/SVD, and spectral clustering. General minimax lower and upper bounds are established to characterize the interplay between the information-geometric complexity of the constraint set for the principal subspaces, the signal-to-noise ratio (SNR), and the dimensionality. The results yield interesting phase transition phenomena concerning the rates of convergence as a function of the SNRs and the fundamental limit for consistent estimation. Applying the general results to the specific settings yields the minimax rates of convergence for those problems, including the previously unknown optimal rates for sparse SVD, nonnegative PCA/SVD, and subspace-constrained PCA/SVD.

Abstract Predicting sets of outcomes, instead of unique outcomes, is a promising solution to uncertainty quantification in statistical learning. Despite a rich literature on constructing prediction sets with statistical guarantees, adapting to unknown covariate shift, a prevalent issue in practice, poses a serious unsolved challenge. In this article, we show that prediction sets with a finite-sample coverage guarantee are uninformative and propose a novel flexible distribution-free method, PredSet-1Step, to efficiently construct prediction sets with an asymptotic coverage guarantee under unknown covariate shift. We formally show that our method is asymptotically probably approximately correct, having well-calibrated coverage error with high confidence for large samples. We illustrate that it achieves nominal coverage in a number of experiments and on a data set concerning HIV risk prediction in a South African cohort study. Our theory hinges on a new bound for the convergence rate of the coverage of Wald confidence intervals based on general asymptotically linear estimators.
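The object the last sentence concerns, the coverage of a Wald confidence interval built from an asymptotically linear estimator, is easy to probe by simulation. This sketch is not the PredSet-1Step method; it simply checks empirically how close the coverage of the plain sample-mean Wald interval is to the nominal level on skewed data. The exponential model, sample size, and repetition count are illustrative.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
mu, n, reps, alpha = 2.0, 500, 4000, 0.05
z = norm.ppf(1 - alpha / 2)

hits = 0
for _ in range(reps):
    x = rng.exponential(scale=mu, size=n)       # skewed data with mean mu
    xbar = x.mean()
    se = x.std(ddof=1) / np.sqrt(n)             # plug-in standard error
    hits += (xbar - z * se <= mu <= xbar + z * se)

coverage = hits / reps                          # empirical coverage, close to 1 - alpha
```

The gap between `coverage` and $1-\alpha$ shrinks as $n$ grows, and bounding how fast that happens for general asymptotically linear estimators is the technical ingredient the abstract highlights.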