Motivated by a multimodal neuroimaging study of Alzheimer's disease, we study in this article the inference problem, that is, hypothesis testing, for sequential mediation analysis. Existing sequential mediation solutions focus mostly on sparse estimation, whereas hypothesis testing is a fundamentally different and more challenging problem. Meanwhile, the few available mediation testing solutions often ignore potential dependence among the mediators or cannot be applied directly to the sequential setting. We propose a statistical inference procedure to test mediation pathways when there are multiple sequentially ordered data modalities and each modality involves multiple mediators. We allow the mediators to be conditionally dependent and the number of mediators within each modality to diverge with the sample size. We provide explicit significance quantification and establish theoretical guarantees on asymptotic size, power, and false discovery control. We demonstrate the efficacy of the method through both simulations and an application to a multimodal neuroimaging pathway analysis of Alzheimer's disease.
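For concreteness, a minimal sketch of a two-modality sequential mediation model is given below; the linear structural form and the notation are illustrative assumptions, not necessarily those used in the article.

```latex
% Illustrative two-modality sequential mediation model (notation assumed):
% exposure E, first-modality mediators M^{(1)} (p_1-dimensional),
% second-modality mediators M^{(2)} (p_2-dimensional), and outcome Y.
\begin{align*}
  M^{(1)} &= \alpha E + \varepsilon_1, \\
  M^{(2)} &= \beta E + B M^{(1)} + \varepsilon_2, \\
  Y       &= \gamma E + \delta_1^{\top} M^{(1)} + \delta_2^{\top} M^{(2)} + \varepsilon_3.
\end{align*}
% A sequential pathway E -> M_j^{(1)} -> M_k^{(2)} -> Y is active only if the
% product \alpha_j B_{kj} \delta_{2,k} is nonzero, so testing that pathway
% amounts to testing whether this product of path coefficients equals zero.
```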
- Award ID(s): 1846832
- PAR ID: 10402812
- Date Published:
- Journal Name: Econometrica
- Volume: 91
- Issue: 1
- ISSN: 0012-9682
- Page Range / eLocation ID: 299 to 327
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
We consider the problem of sequential multiple hypothesis testing with nontrivial data collection costs. This problem appears, for example, when conducting biological experiments to identify differentially expressed genes of a disease process. This work builds on the generalized α-investing framework, which enables control of the marginal false discovery rate in a sequential testing setting. We provide a theoretical analysis of the long-term asymptotic behavior of α-wealth, which motivates incorporating sample size into the α-investing decision rule. Posing the testing process as a game with nature, we construct a decision rule that optimizes the expected α-wealth reward (ERO) and provides an optimal sample size for each test. Empirical results show that a cost-aware ERO decision rule correctly rejects more false null hypotheses than other methods for $n=1$, where $n$ is the sample size. When the sample size is not fixed, cost-aware ERO uses a prior on the null hypothesis to adaptively allocate the sample budget to each test. We extend cost-aware ERO investing to finite-horizon testing, which enables the decision rule to allocate samples in a non-myopic manner. Finally, empirical tests on real data sets from biological experiments show that cost-aware ERO balances the allocation of samples to an individual test against the allocation of samples across multiple tests.
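For intuition about the underlying α-investing mechanics, here is a minimal sketch of a Foster–Stine-style α-investing rule. The spending rule and payout constant below are illustrative assumptions; this is not the paper's cost-aware ERO procedure.

```python
# Minimal sketch of a Foster-Stine-style alpha-investing procedure.
# The spending rule (half the current wealth) and the payout constant are
# illustrative assumptions, not the cost-aware ERO rule from the paper.

def alpha_investing(p_values, initial_wealth=0.05, payout=0.05):
    """Sequentially test hypotheses, spending and earning alpha-wealth."""
    wealth = initial_wealth
    decisions = []
    for p in p_values:
        alpha_j = wealth / 2.0          # illustrative spending rule
        reject = p <= alpha_j
        if reject:
            wealth += payout            # a discovery earns wealth back
        else:
            wealth -= alpha_j / (1.0 - alpha_j)   # a non-rejection costs wealth
        decisions.append(reject)
        if wealth <= 0:
            # Out of wealth: no further rejections are possible.
            decisions.extend([False] * (len(p_values) - len(decisions)))
            break
    return decisions

# Example: a stream of p-values with two strong signals.
print(alpha_investing([0.001, 0.2, 0.0005, 0.8, 0.04]))
```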
In this work, we show that for a nontrivial hypothesis class C, we can estimate the distance of a target function f to C (i.e., the error rate of the best h∈C) using substantially fewer labeled examples than would be needed to actually learn a good h∈C. Specifically, we show that for the class C of unions of d intervals on the line, in the active learning setting in which we have access to a pool of unlabeled examples drawn from an arbitrary underlying distribution D, we can estimate the error rate of the best h∈C to an additive error ϵ with a number of label requests that is independent of d and depends only on ϵ. In particular, we make O((1/ϵ^6)log(1/ϵ)) label queries to an unlabeled pool of size O((d/ϵ^2)log(1/ϵ)). This task of estimating the distance of an unknown f to a given class C is called tolerant testing or distance estimation in the testing literature, and is usually studied in a membership query model with respect to the uniform distribution. Our work extends that of Balcan et al. (2012), who solved the non-tolerant testing problem for this class (distinguishing the zero-error case from the case that the best hypothesis in the class has error greater than ϵ). We also consider the related problem of estimating the performance of a given learning algorithm A in this setting. That is, given a large pool of unlabeled examples drawn from distribution D, can we, from only a few label queries, estimate how well A would perform if the entire dataset were labeled and given as training data to A? We focus on k-Nearest Neighbor style algorithms, and also show how our results can be applied to the problem of hyperparameter tuning (selecting the best value of k for the given learning problem).
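To make the target quantity concrete, the toy sketch below estimates the error rate of a fixed hypothesis by labeling a small uniform subsample of the pool. This naive baseline needs on the order of 1/ϵ² labels and is not the paper's query-efficient tolerant tester, whose label complexity for unions of d intervals is independent of d; the target, hypothesis, and pool here are made up for illustration.

```python
# Naive baseline for distance estimation: estimate the error of a fixed
# hypothesis h on a large unlabeled pool by labeling a small random subsample.
# This is NOT the paper's query-efficient tolerant tester; it only shows the
# quantity being estimated (the error rate of a hypothesis against f).
import random

def estimate_error(h, pool, query_label, num_queries=200, seed=0):
    """Estimate Pr[h(x) != f(x)] from `num_queries` label requests."""
    rng = random.Random(seed)
    sample = rng.sample(pool, min(num_queries, len(pool)))
    mistakes = sum(h(x) != query_label(x) for x in sample)
    return mistakes / len(sample)

# Toy example: f is a union of two intervals on [0, 1]; h is a single interval.
f = lambda x: 0.1 <= x <= 0.3 or 0.6 <= x <= 0.8   # unknown target
h = lambda x: 0.1 <= x <= 0.8                       # candidate hypothesis
pool = [random.random() for _ in range(10_000)]     # unlabeled pool drawn from D
print(estimate_error(h, pool, query_label=f))
```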
We investigate the nonparametric, composite hypothesis testing problem for arbitrary unknown distributions in the asymptotic regime where both the sample size and the number of hypotheses grow exponentially large. Such asymptotic analysis is important in many practical problems, where the number of variations that can exist within a family of distributions can be countably infinite. We introduce the notion of discrimination capacity, which captures the largest exponential growth rate of the number of hypotheses relative to the sample size such that there exists a test with asymptotically vanishing probability of error. Our approach is based on various distributional distance metrics in order to incorporate the generative model of the data. We analyze the error exponent using the maximum mean discrepancy and the Kolmogorov–Smirnov distance and characterize the corresponding discrimination rates, i.e., lower bounds on the discrimination capacity, for these tests. Finally, an upper bound on the discrimination capacity based on Fano's inequality is developed. Numerical results are presented to validate the theoretical results.
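As a concrete illustration of the distance metrics involved, the sketch below computes an empirical RBF-kernel MMD and the Kolmogorov–Smirnov statistic between two samples; the kernel bandwidth and the simulated distributions are assumptions for illustration only, not the paper's test construction.

```python
# Illustrative two-sample statistics of the kind analyzed in the paper:
# maximum mean discrepancy (MMD) with an RBF kernel and the Kolmogorov-Smirnov
# statistic. Bandwidth and simulated distributions are illustrative assumptions.
import numpy as np
from scipy.stats import ks_2samp

def mmd_rbf(x, y, bandwidth=1.0):
    """Biased empirical MMD^2 between 1-D samples x and y with an RBF kernel."""
    def k(a, b):
        d = a[:, None] - b[None, :]
        return np.exp(-d ** 2 / (2.0 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=500)   # sample under one hypothesis
y = rng.normal(0.5, 1.0, size=500)   # sample under another hypothesis
print("MMD^2:", mmd_rbf(x, y))
print("KS   :", ks_2samp(x, y).statistic)
```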
Problem-solving is a typical type of assessment in engineering dynamics tests. To solve a problem, students need to set up equations and find a numerical answer. Depending on its difficulty and complexity, a quantitative problem can take anywhere from ten to thirty minutes to solve. Due to the time constraints of in-class testing, a typical test may contain only a limited number of problems, covering an insufficient range of problem types. This can reduce validity and reliability, two crucial factors that contribute to assessment results. A test with high validity should cover proper content and should be able to distinguish high-performing students from low-performing students and every student in between. A reliable test should have a sufficient number of items to provide consistent information about students' mastery of the materials. In this work-in-progress study, we investigate to what extent a newly developed assessment is valid and reliable. Symbolic problem solving in this study refers to solving problems by setting up a system of equations without finding numeric solutions. Such problems usually take much less time, so this approach allows a test to include more problems covering a more diverse range of types. To evaluate the new assessment's content validity and internal consistency, we will follow the Standards for Educational and Psychological Testing (the Standards), developed jointly by the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME). Examples on rectilinear kinematics and angular motion will be provided to illustrate how symbolic problem solving is used in both homework and assessments; an illustrative example is sketched below. Numerous studies in the literature have shown that symbolic questions impose greater challenges because of students' algebraic difficulties. Thus, we will share strategies on how to prepare students to approach such problems.
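As an illustration of the format (our own example, not an item from the study), a symbolic rectilinear-kinematics problem might ask the student to set up the constant-acceleration equations and derive the acceleration without plugging in numbers:

```latex
% Illustrative symbolic problem (constant-acceleration rectilinear kinematics):
% the student sets up the governing equations and solves for a symbolically.
\begin{align*}
  v &= v_0 + a t, \\
  x &= x_0 + v_0 t + \tfrac{1}{2} a t^2, \\
  \Rightarrow\quad v^2 &= v_0^2 + 2 a (x - x_0)
    \qquad\text{(eliminating $t$), so}\quad a = \frac{v^2 - v_0^2}{2 (x - x_0)}.
\end{align*}
```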