Differential privacy provides a rigorous framework for privacy-preserving data analysis. This paper proposes the first differentially private procedure for controlling the false discovery rate (FDR) in multiple hypothesis testing. Inspired by the Benjamini-Hochberg procedure (BHq), our approach is to first repeatedly add noise to the logarithms of the p-values to ensure differential privacy and to select an approximately smallest p-value serving as a promising candidate at each iteration; the selected p-values are further supplied to the BHq and our private procedure releases only the rejected ones. Moreover, we develop a new technique that is based on a backward submartingale for proving FDR control of a broad class of multiple testing procedures, including our private procedure, and both the BHq step- up and step-down procedures. As a novel aspect, the proof works for arbitrary dependence between the true null and false null test statistics, while FDR control is maintained up to a small multiplicative factor.
more »
« less
False Discovery Rate Control with E-values
Abstract E-values have gained attention as potential alternatives to p-values as measures of uncertainty, significance and evidence. In brief, e-values are realized by random variables with expectation at most one under the null; examples include betting scores, (point null) Bayes factors, likelihood ratios and stopped supermartingales. We design a natural analogue of the Benjamini-Hochberg (BH) procedure for false discovery rate (FDR) control that utilizes e-values, called the e-BH procedure, and compare it with the standard procedure for p-values. One of our central results is that, unlike the usual BH procedure, the e-BH procedure controls the FDR at the desired level—with no correction—for any dependence structure between the e-values. We illustrate that the new procedure is convenient in various settings of complicated dependence, structured and post-selection hypotheses, and multi-armed bandit problems. Moreover, the BH procedure is a special case of the e-BH procedure through calibration between p-values and e-values. Overall, the e-BH procedure is a novel, powerful and general tool for multiple testing under dependence, that is complementary to the BH procedure, each being an appropriate choice in different applications.
more »
« less
- Award ID(s):
- 1945266
- PAR ID:
- 10398631
- Publisher / Repository:
- Oxford University Press
- Date Published:
- Journal Name:
- Journal of the Royal Statistical Society Series B: Statistical Methodology
- Volume:
- 84
- Issue:
- 3
- ISSN:
- 1369-7412
- Format(s):
- Medium: X Size: p. 822-852
- Size(s):
- p. 822-852
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract We consider problems where many, somewhat redundant, hypotheses are tested and we are interested in reporting the most precise rejections, with false discovery rate (FDR) control. This is the case, for example, when researchers are interested both in individual hypotheses as well as group hypotheses corresponding to intersections of sets of the original hypotheses, at several resolution levels. A concrete application is in genome-wide association studies, where, depending on the signal strengths, it might be possible to resolve the influence of individual genetic variants on a phenotype with greater or lower precision. To adapt to the unknown signal strength, analyses are conducted at multiple resolutions and researchers are most interested in the more precise discoveries. Assuring FDR control on the reported findings with these adaptive searches is, however, often impossible. To design a multiple comparison procedure that allows for an adaptive choice of resolution with FDR control, we leverage e-values and linear programming. We adapt this approach to problems where knockoffs and group knockoffs have been successfully applied to test conditional independence hypotheses. We demonstrate its efficacy by analysing data from the UK Biobank.more » « less
-
ABSTRACT In many applications, the process of identifying a specific feature of interest often involves testing multiple hypotheses for their joint statistical significance. Examples include mediation analysis, which simultaneously examines the existence of the exposure-mediator and the mediator-outcome effects, and replicability analysis, aiming to identify simultaneous signals that exhibit statistical significance across multiple independent studies. In this work, we present a new approach called the joint mirror (JM) procedure that effectively detects such features while maintaining false discovery rate (FDR) control in finite samples. The JM procedure employs an iterative method that gradually shrinks the rejection region based on progressively revealed information until a conservative estimate of the false discovery proportion is below the target FDR level. Additionally, we introduce a more stringent error measure known as the composite FDR (cFDR), which assigns weights to each false discovery based on its number of null components. We use the leave-one-out technique to prove that the JM procedure controls the cFDR in finite samples. To implement the JM procedure, we propose an efficient algorithm that can incorporate partial ordering information. Through extensive simulations, we show that our procedure effectively controls the cFDR and enhances statistical power across various scenarios, including the case that test statistics are dependent across the features. Finally, we showcase the utility of our method by applying it to real-world mediation and replicability analyses.more » « less
-
One important partition of algorithms for controlling the false discovery rate (FDR) in multiple testing is into offline and online algorithms. The first generally achieve significantly higher power of discovery, while the latter allow making decisions sequentially as well as adaptively formulating hypotheses based on past observations. Using existing methodology, it is unclear how one could trade off the benefits of these two broad families of algorithms, all the while preserving their formal FDR guarantees. To this end, we introduce Batch-BH and Batch-St-BH, algorithms for controlling the FDR when a possibly infinite sequence of batches of hypotheses is tested by repeated application of one of the most widely used offline algorithms, the Benjamini-Hochberg (BH) method or Storey’s improvement of the BH method. We show that our algorithms interpolate between existing online and offline methodology, thus trading off the best of both worlds.more » « less
-
In bandit multiple hypothesis testing, each arm corresponds to a different null hypothesis that we wish to test, and the goal is to design adaptive algorithms that correctly identify large set of interesting arms (true discoveries), while only mistakenly identifying a few uninteresting ones (false discoveries). One common metric in non-bandit multiple testing is the false discovery rate (FDR). We propose a unified, modular framework for bandit FDR control that emphasizes the decoupling of exploration and summarization of evidence. We utilize the powerful martingale-based concept of “e-processes” to ensure FDR control for arbitrary composite nulls, exploration rules and stopping times in generic problem settings. In particular, valid FDR control holds even if the reward distributions of the arms could be dependent, multiple arms may be queried simultaneously, and multiple (cooperating or competing) agents may be querying arms, covering combinatorial semi-bandit type settings as well. Prior work has considered in great detail the setting where each arm’s reward distribution is independent and sub-Gaussian, and a single arm is queried at each step. Our framework recovers matching sample complexity guarantees in this special case, and performs comparably or better in practice. For other settings, sample complexities will depend on the finer details of the problem (composite nulls being tested, exploration algorithm, data dependence structure, stopping rule) and we do not explore these; our contribution is to show that the FDR guarantee is clean and entirely agnostic to these details.more » « less
An official website of the United States government
