Abstract In this paper, we propose a new framework for constructing confidence sets for a $$d$$-dimensional unknown sparse parameter $${\boldsymbol \theta }$$ under the normal mean model $${\boldsymbol X}\sim N({\boldsymbol \theta },\sigma ^{2}\mathbf{I})$$. A key feature of the proposed confidence set is its ability to account for the sparsity of $${\boldsymbol \theta }$$; we therefore call it a sparse confidence set. This is in sharp contrast with classical methods, such as Bonferroni confidence intervals and resampling-based procedures, in which the sparsity of $${\boldsymbol \theta }$$ is typically ignored. Specifically, we require the desired sparse confidence set to satisfy two conditions: (i) uniformly over the parameter space, the coverage probability for $${\boldsymbol \theta }$$ is above a pre-specified level; (ii) there exists a random subset $$S$$ of $$\{1,\ldots ,d\}$$ such that $$S$$ guarantees a pre-specified true negative rate for detecting the non-zero $$\theta _{j}$$'s. To exploit the sparsity of $${\boldsymbol \theta }$$, we allow the confidence interval for $$\theta _{j}$$ to degenerate to the single point 0 for any $$j\notin S$$. Under this new framework, we first ask whether sparse confidence sets satisfying the two conditions exist at all. To address this question, we establish a non-asymptotic minimax lower bound for the non-coverage probability over a suitable class of sparse confidence sets. The lower bound reveals the roles of sparsity and the minimum signal-to-noise ratio (SNR) in the construction of sparse confidence sets. Furthermore, under suitable conditions on the SNR, a two-stage procedure is proposed to construct a sparse confidence set. To evaluate its optimality, the proposed sparse confidence set is shown to attain a minimax lower bound on a suitably defined risk function, up to a constant factor. Finally, we develop a procedure that adapts to the unknown sparsity. Numerical studies are conducted to verify the theoretical results.
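The paper's calibrated two-stage construction is not reproduced in the abstract, but the general shape of such a procedure — screen coordinates against a threshold to form $$S$$, then attach Bonferroni-corrected intervals on $$S$$ and the degenerate interval $$\{0\}$$ elsewhere — can be sketched as follows. The threshold and interval width here are generic illustrative choices, not the constants derived in the paper.

```python
import numpy as np
from scipy.stats import norm

def sparse_confidence_set(x, sigma, alpha=0.05, beta=0.05):
    """Illustrative two-stage sparse confidence set for X ~ N(theta, sigma^2 I).
    Stage 1 screens coordinates to form S; stage 2 attaches Bonferroni
    intervals on S and the point {0} off S.  The universal-style threshold
    below is a hypothetical choice, not the paper's calibrated constant."""
    d = len(x)
    # Stage 1: keep coordinates whose observation clearly exceeds the noise level.
    threshold = sigma * np.sqrt(2.0 * np.log(d / beta))
    S = np.flatnonzero(np.abs(x) > threshold)
    # Stage 2: Bonferroni-corrected intervals on S, degenerate {0} elsewhere.
    half_width = sigma * norm.ppf(1.0 - alpha / (2.0 * max(len(S), 1)))
    lower, upper = np.zeros(d), np.zeros(d)
    lower[S], upper[S] = x[S] - half_width, x[S] + half_width
    return S, lower, upper

# Example: d = 1000 with five strong signals.
rng = np.random.default_rng(0)
theta = np.zeros(1000)
theta[:5] = 6.0
x = theta + rng.standard_normal(1000)
S, lo, hi = sparse_confidence_set(x, sigma=1.0)
print(S, list(zip(lo[S].round(2), hi[S].round(2))))
```

The point of the sketch is only the structure: intervals off $$S$$ collapse to 0, so the set's total volume scales with $$|S|$$ rather than $$d$$; the paper's contribution is choosing the two stages so that coverage and the true negative rate hold uniformly.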
Optimal Confidence Sets for the Multinomial Parameter
Construction of tight confidence sets and intervals is central to statistical inference and decision making. This paper develops new theory characterizing minimum average volume confidence sets for categorical data. More precisely, consider an empirical distribution $$\hat{p}$$ generated from $$n$$ iid realizations of a random variable that takes one of $$k$$ possible values according to an unknown distribution $$p$$; the vector of counts is a single draw from a multinomial distribution. A confidence set is a subset of the probability simplex that depends on $$\hat{p}$$ and contains the unknown $$p$$ with a specified confidence. This paper shows how one can construct minimum average volume confidence sets. The optimality of the sets translates into improved sample complexity for adaptive machine learning algorithms that rely on confidence sets, regions, and intervals.
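For contrast with the optimal construction, a classical (and much looser) confidence set for $$p$$ comes from the method of types: the KL ball $$\{p : \mathrm{KL}(\hat{p}\,\|\,p) \le (k\log (n+1)+\log (1/\delta ))/n\}$$ is valid at level $$1-\delta$$. A minimal membership test for this baseline set:

```python
import numpy as np

def kl_divergence(p_hat, p, eps=1e-12):
    """KL(p_hat || p) with the convention 0 * log(0/q) = 0."""
    mask = p_hat > 0
    return float(np.sum(p_hat[mask] * np.log(p_hat[mask] / np.maximum(p[mask], eps))))

def in_kl_confidence_set(p_hat, p, n, delta=0.05):
    """Membership test for a classical Sanov-type confidence set:
    { p : KL(p_hat || p) <= (k*log(n+1) + log(1/delta)) / n }.
    Valid at level 1 - delta by the method of types, but generally much
    larger than the minimum average volume sets developed in the paper."""
    k = len(p_hat)
    radius = (k * np.log(n + 1) + np.log(1.0 / delta)) / n
    return kl_divergence(p_hat, p) <= radius

# Example: k = 4 categories, n = 500 draws.
rng = np.random.default_rng(1)
p_true = np.array([0.4, 0.3, 0.2, 0.1])
p_hat = rng.multinomial(500, p_true) / 500
print(in_kl_confidence_set(p_hat, p_true, n=500))  # True with prob >= 0.95
```

This baseline shrinks at the right rate in $$n$$ but with conservative constants, which illustrates the kind of slack a minimum average volume construction can remove.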
- Award ID(s):
- 2023239
- PAR ID:
- 10533098
- Publisher / Repository:
- IEEE
- Date Published:
- ISBN:
- 978-1-5386-8209-8
- Page Range / eLocation ID:
- 2173 to 2178
- Format(s):
- Medium: X
- Location:
- Melbourne, Australia
- Sponsoring Org:
- National Science Foundation
More Like this
-
Summary Since the introduction of fiducial inference by Fisher in the 1930s, its application has been largely confined to relatively simple parametric problems. In this paper, we present what may be the first systematic application of fiducial inference to the estimation of a nonparametric survival function under right censoring. We find that the resulting fiducial distribution gives rise to surprisingly good statistical procedures applicable to both one-sample and two-sample problems. In particular, we use the fiducial distribution of a survival function to construct pointwise and curvewise confidence intervals for the survival function, and propose tests based on the curvewise confidence intervals. We establish a functional Bernstein–von Mises theorem, and perform thorough simulation studies in scenarios with different levels of censoring. The proposed fiducial-based confidence intervals maintain coverage in situations where asymptotic methods often have substantial coverage problems. Furthermore, the average length of the proposed confidence intervals is often shorter than that of competing methods that do maintain coverage. Finally, the proposed fiducial test is more powerful than various types of log-rank and sup log-rank tests in some scenarios. We illustrate the proposed fiducial test by comparing chemotherapy against chemotherapy combined with radiotherapy, using data from the treatment of locally unresectable gastric cancer.
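The fiducial distribution itself is resampling-based and beyond a short snippet, but the classical asymptotic competitor mentioned above — the Kaplan–Meier estimator with Greenwood pointwise intervals — is easy to sketch, and it is the kind of method whose coverage can degrade under heavy censoring:

```python
import numpy as np
from scipy.stats import norm

def km_greenwood(times, events, alpha=0.05):
    """Kaplan-Meier estimate with Greenwood pointwise CIs: the classical
    asymptotic baseline against which the fiducial intervals are compared.
    times: observed times; events: 1 = event observed, 0 = right-censored."""
    order = np.argsort(times)
    times, events = np.asarray(times)[order], np.asarray(events)[order]
    event_times = np.unique(times[events == 1])   # distinct observed event times
    surv, var_sum, out = 1.0, 0.0, []
    z = norm.ppf(1 - alpha / 2)
    for t in event_times:
        at_risk = np.sum(times >= t)
        d = np.sum((times == t) & (events == 1))
        surv *= 1 - d / at_risk                    # product-limit update
        var_sum += d / (at_risk * (at_risk - d))   # Greenwood accumulator
        se = surv * np.sqrt(var_sum)
        out.append((t, surv, max(surv - z * se, 0.0), min(surv + z * se, 1.0)))
    return out

# Toy data: 8 subjects, event indicator 0 marks censoring.
t = [2, 3, 3, 5, 6, 7, 9, 10]
e = [1, 1, 0, 1, 0, 1, 1, 0]
for row in km_greenwood(t, e):
    print("t=%g  S=%.3f  CI=(%.3f, %.3f)" % row)
```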
-
A flexible conformal inference method is developed to construct confidence intervals for the frequencies of queried objects in very large data sets, based on a much smaller sketch of those data. The approach is data-adaptive and requires no knowledge of the data distribution or of the details of the sketching algorithm; instead, it constructs provably valid frequentist confidence intervals under the sole assumption of data exchangeability. Although our solution is broadly applicable, this paper focuses on applications involving the count-min sketch algorithm and a non-linear variation thereof. The performance is compared to that of frequentist and Bayesian alternatives through simulations and experiments with data sets of SARS-CoV-2 DNA sequences and classic English literature.
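The count-min sketch at the center of the application compresses a stream into a small table of counters whose queries never undercount; the conformal layer then turns those one-sided estimates into calibrated intervals. A minimal sketch of the data structure itself (the hashing choices here are illustrative only):

```python
import numpy as np

class CountMinSketch:
    """Minimal count-min sketch: a depth x width table of counters, one hash
    row per depth.  The estimated frequency is the minimum counter across
    rows; collisions only inflate counters, so it never underestimates.
    Seeded Python hashing is fine for illustration, not adversarial inputs."""
    def __init__(self, width=2048, depth=5, seed=0):
        self.width, self.depth = width, depth
        self.table = np.zeros((depth, width), dtype=np.int64)
        rng = np.random.default_rng(seed)
        self.salts = rng.integers(0, 2**31, size=depth)

    def _cols(self, item):
        return [hash((int(s), item)) % self.width for s in self.salts]

    def add(self, item, count=1):
        for row, col in enumerate(self._cols(item)):
            self.table[row, col] += count

    def query(self, item):
        # Each row overcounts due to collisions, so the minimum is an upper bound.
        return min(self.table[row, col] for row, col in enumerate(self._cols(item)))

cms = CountMinSketch()
for token in ["a"] * 100 + ["b"] * 7 + ["c"]:
    cms.add(token)
print(cms.query("a"), cms.query("b"), cms.query("c"))  # >= 100, 7, 1
```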
-
We study statistical inference and distributionally robust solution methods for stochastic optimization problems, focusing on confidence intervals for optimal values and solutions that achieve exact coverage asymptotically. We develop a generalized empirical likelihood framework—based on distributional uncertainty sets constructed from nonparametric f-divergence balls—for Hadamard differentiable functionals, and in particular, stochastic optimization problems. As consequences of this theory, we provide a principled method for choosing the size of distributional uncertainty regions to provide one- and two-sided confidence intervals that achieve exact coverage. We also give an asymptotic expansion for our distributionally robust formulation, showing how robustification regularizes problems by their variance. Finally, we show that optimizers of the distributionally robust formulations we study enjoy (essentially) the same consistency properties as those in classical sample average approximations. Our general approach applies to quickly mixing stationary sequences, including geometrically ergodic Harris recurrent Markov chains.
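The variance-regularization phenomenon behind the asymptotic expansion can be seen directly in the simplest case: for a chi-square-divergence ball around the empirical distribution, the worst-case mean has a closed form equal to the sample mean plus a variance penalty, provided the ball is small enough that the nonnegativity constraints on the weights stay slack. A numerical sketch under those assumptions:

```python
import numpy as np

def robust_mean_chi2(z, rho):
    """Worst-case mean of z over the chi-square-divergence ball
    { q : n * sum_i (q_i - 1/n)^2 <= rho / n } around the empirical
    (uniform) distribution.  When nonnegativity of q is slack, the
    maximizer has the closed form below, giving exactly
    mean(z) + sqrt(rho * var(z) / n): robustness regularizes by variance."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    zbar, s2 = z.mean(), z.var()
    dev = z - zbar
    # Worst-case weights: tilt the uniform weights toward large z_i.
    q = 1.0 / n + (np.sqrt(rho) / n) * dev / np.linalg.norm(dev)
    assert np.all(q >= 0), "ball too large: nonnegativity binds"
    return float(q @ z), zbar + np.sqrt(rho * s2 / n)

rng = np.random.default_rng(2)
z = rng.exponential(size=200)        # e.g. per-sample losses
exact, expansion = robust_mean_chi2(z, rho=1.0)
print(exact, expansion)              # the two coincide: mean + sqrt(rho*var/n)
```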
-
The average treatment effect (ATE) measures the change in overall social welfare, but even a positive ATE can conceal a negative effect on, say, 10% of the population. Assessing such risk is difficult, however, because any individual treatment effect (ITE) is never observed, so the 10% worst-affected cannot be identified, whereas distributional treatment effects only compare the first deciles within each treatment group, which does not correspond to any 10% subpopulation. In this paper, we consider how to nonetheless assess this important risk measure, formalized as the conditional value at risk (CVaR) of the ITE distribution. We leverage the availability of pretreatment covariates and characterize the tightest possible upper and lower bounds on the ITE-CVaR given by the covariate-conditional average treatment effect (CATE) function. We then proceed to study how to estimate these bounds efficiently from data and construct confidence intervals. This is challenging even in randomized experiments, as it requires understanding the distribution of the unknown CATE function, which can be very complex if we use rich covariates to best control for heterogeneity. We develop a debiasing method that overcomes this and prove that it enjoys favorable statistical properties even when CATE and other nuisances are estimated by black-box machine learning or even inconsistently. Studying a hypothetical change to French job search counseling services, our bounds and inference demonstrate that a small social benefit entails a negative impact on a substantial subpopulation. This paper was accepted by J. George Shanthikumar, data science. Funding: This work was supported by the Division of Information and Intelligent Systems [Grant 1939704]. Supplemental Material: The data files and online appendices are available at https://doi.org/10.1287/mnsc.2023.4819.
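As a plug-in illustration of the bound (without the paper's debiasing corrections), the covariate-based bound on the ITE-CVaR is the CVaR of the CATE values themselves; the variable names below are hypothetical:

```python
import numpy as np

def lower_cvar(values, alpha=0.10):
    """CVaR of the lower alpha-tail: the average of the worst alpha-fraction.
    Applied to CATE estimates tau(X_i), this is a plug-in version of the
    covariate-based bound on the ITE-CVaR discussed above (the paper's
    debiased estimator adds corrections for estimation error in tau)."""
    v = np.sort(np.asarray(values, dtype=float))
    m = max(int(np.ceil(alpha * len(v))), 1)   # size of the worst tail
    return v[:m].mean()

# Toy example: CATE estimates from some first-stage model (hypothetical).
rng = np.random.default_rng(3)
tau_hat = 0.5 + rng.standard_normal(10_000)    # average effect ~ +0.5
print("ATE estimate:    ", tau_hat.mean())
print("CVaR(10%) bound: ", lower_cvar(tau_hat, alpha=0.10))
```

On this toy draw the ATE is positive while the 10% lower-tail CVaR is sharply negative, mirroring the abstract's finding that a small average benefit can coexist with a negative impact on a substantial subpopulation.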