ABSTRACT Mediation analysis is widely utilized in neuroscience to investigate the role of brain imaging phenotypes in the neurological pathways from genetic exposures to clinical outcomes. However, it is still difficult to conduct mediation analyses with whole genome‐wide exposures and brain subcortical shape mediators due to several challenges, including (i) large‐scale genetic exposures, that is, millions of single‐nucleotide polymorphisms (SNPs); (ii) the nonlinear Hilbert space for shape mediators; and (iii) statistical inference on the direct and indirect effects. To tackle these challenges, this paper proposes a genome‐wide mediation analysis framework with brain subcortical shape mediators. First, to address the issue caused by the high dimensionality of the genetic exposures, a fast genome‐wide association analysis is conducted to discover potential genetic variants with significant effects on the clinical outcome. Second, square‐root velocity function representations are extracted from the brain subcortical shapes; these fall in an unconstrained linear Hilbert subspace. Third, to identify the underlying causal pathways from the detected SNPs to the clinical outcome through the shape mediators, we utilize a shape mediation analysis framework consisting of a shape‐on‐scalar model and a scalar‐on‐shape model. Furthermore, a bootstrap resampling approach is adopted to assess the significance of both global and spatial mediation effects. Finally, our framework is applied to the corpus callosum shape data from the Alzheimer's Disease Neuroimaging Initiative.
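The core mediation machinery the abstract describes — fit a mediator model and an outcome model, take the product of the two path coefficients as the indirect effect, and bootstrap it for inference — can be sketched in a toy form. This is an illustrative simplification, not the paper's method: a scalar mediator stands in for the shape-valued SRVF mediators, and all simulated effect sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mediation setting: exposure X -> mediator M -> outcome Y.
# (A scalar mediator stands in for the paper's shape-valued SRVF mediators.)
n = 500
X = rng.binomial(2, 0.3, size=n).astype(float)   # SNP-like exposure coded 0/1/2
M = 0.5 * X + rng.normal(size=n)                 # mediator model (true alpha = 0.5)
Y = 0.4 * M + 0.2 * X + rng.normal(size=n)       # outcome model (true beta = 0.4)

def indirect_effect(x, m, y):
    # alpha: slope of M on X; beta: slope of Y on M, adjusting for X.
    alpha = np.polyfit(x, m, 1)[0]
    beta = np.linalg.lstsq(np.column_stack([m, x, np.ones_like(x)]), y,
                           rcond=None)[0][0]
    return alpha * beta

# Nonparametric bootstrap for a percentile confidence interval.
B = 2000
est = indirect_effect(X, M, Y)
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = indirect_effect(X[idx], M[idx], Y[idx])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"indirect effect ~ {est:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

A CI excluding zero is the bootstrap evidence for a nonzero indirect (mediated) effect; the paper applies the same logic globally and pointwise over the shape.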
Sequential pathway inference for multimodal neuroimaging analysis
Motivated by a multimodal neuroimaging study for Alzheimer's disease, in this article we study the inference problem, that is, hypothesis testing, of sequential mediation analysis. Existing sequential mediation solutions mostly focus on sparse estimation, while hypothesis testing is an utterly different and more challenging problem. Meanwhile, the few mediation testing solutions often ignore the potential dependency among the mediators or cannot be applied to the sequential problem directly. We propose a statistical inference procedure to test mediation pathways when there are sequentially ordered multiple data modalities and each modality involves multiple mediators. We allow the mediators to be conditionally dependent and the number of mediators within each modality to diverge with the sample size. We provide explicit significance quantification and establish theoretical guarantees in terms of asymptotic size, power, and false discovery control. We demonstrate the efficacy of the method through both simulations and an application to a multimodal neuroimaging pathway analysis of Alzheimer's disease.
- Award ID(s): 2102227
- PAR ID: 10380534
- Publisher / Repository: Wiley Blackwell (John Wiley & Sons)
- Date Published:
- Journal Name: Stat
- Volume: 11
- Issue: 1
- ISSN: 2049-1573
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
We consider the problem of sequential multiple hypothesis testing with nontrivial data collection costs. This problem appears, for example, when conducting biological experiments to identify differentially expressed genes of a disease process. This work builds on the generalized α-investing framework, which enables control of the marginal false discovery rate in a sequential testing setting. We present a theoretical analysis of the long-term asymptotic behavior of α-wealth, which motivates a consideration of sample size in the α-investing decision rule. Posing the testing process as a game with nature, we construct a decision rule that optimizes the expected α-wealth reward (ERO) and provides an optimal sample size for each test. Empirical results show that a cost-aware ERO decision rule correctly rejects more false null hypotheses than other methods for $n=1$, where $n$ is the sample size. When the sample size is not fixed, cost-aware ERO uses a prior on the null hypothesis to adaptively allocate the sample budget to each test. We extend cost-aware ERO investing to finite-horizon testing, which enables the decision rule to allocate samples in a non-myopic manner. Finally, empirical tests on real data sets from biological experiments show that cost-aware ERO balances the allocation of samples to an individual test against the allocation of samples across multiple tests.
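The α-wealth bookkeeping behind this family of methods can be illustrated with a minimal α-investing rule in the style of Foster and Stine. This is a simplified stand-in for the generalized α-investing / cost-aware ERO rules the abstract describes; the spending fraction and payout values are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Minimal alpha-investing sketch: spend a fraction of the current alpha-wealth
# on each test, earn a payout on each rejection, pay a penalty otherwise.
def alpha_investing(pvals, w0=0.05, payout=0.05, spend_frac=0.1):
    wealth = w0
    rejections = []
    for p in pvals:
        if wealth <= 0:
            rejections.append(False)
            continue
        alpha_j = spend_frac * wealth          # simple spending rule
        reject = bool(p <= alpha_j)
        if reject:
            wealth += payout                   # earn wealth on a discovery
        else:
            wealth -= alpha_j / (1 - alpha_j)  # pay for a non-rejection
        rejections.append(reject)
    return rejections

# 20 strong signals (tiny p-values) followed by 80 nulls (uniform p-values).
pvals = np.concatenate([rng.uniform(0, 1e-5, 20), rng.uniform(0, 1, 80)])
decisions = alpha_investing(pvals)
print(sum(decisions[:20]), "of 20 signals rejected;",
      sum(decisions[20:]), "of 80 nulls rejected")
```

Early discoveries grow the wealth and make later tests more powerful, while a run of non-rejections drains it; the ERO rule in the abstract additionally optimizes how many samples to spend per test.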
Abstract Causal mediation analysis aims to examine the role of a mediator or a group of mediators that lie in the pathway between an exposure and an outcome. Recent biomedical studies often involve a large number of potential mediators based on high‐throughput technologies. Most current analytic methods focus on settings with one or a moderate number of potential mediators. With the expanding growth of ‐omics data, joint analysis of molecular‐level genomics data with epidemiological data through mediation analysis is becoming more common. However, such joint analysis requires methods that can simultaneously accommodate high‐dimensional mediators, and such methods are currently lacking. To address this problem, we develop a Bayesian inference method using continuous shrinkage priors to extend previous causal mediation analysis techniques to a high‐dimensional setting. Simulations demonstrate that our method improves the power of global mediation analysis compared to simpler alternatives and performs well in identifying true nonnull contributions to the mediation effects of the pathway. The Bayesian method also helps us to understand the structure of the composite null cases for inactive mediators in the pathway. We applied our method to the Multi‐Ethnic Study of Atherosclerosis and identified DNA methylation regions that may actively mediate the effect of socioeconomic status on cardiometabolic outcomes.
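The shape of a continuous shrinkage prior is easy to see by simulation. The sketch below draws mediator effects from a horseshoe prior, one well-known example of such priors; the abstract does not state which prior the authors use, so this is purely illustrative, and the global scale value is made up.

```python
import numpy as np

rng = np.random.default_rng(2)

# Draws from a horseshoe prior: beta_j ~ N(0, (tau * lambda_j)^2),
# with local scales lambda_j ~ half-Cauchy(0, 1) and global scale tau.
n = 100_000
tau = 0.1                                 # global shrinkage level (arbitrary)
lam = np.abs(rng.standard_cauchy(n))      # local scales, half-Cauchy(0, 1)
beta = rng.normal(0.0, tau * lam)         # simulated mediator effects

# Most draws are pulled toward zero, yet the heavy Cauchy tails still
# allow occasional large effects -- the behavior that preserves power
# for the few truly active mediators.
frac_small = np.mean(np.abs(beta) < 0.01)
print(frac_small, np.max(np.abs(beta)))
```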
Abstract Mediation hypothesis testing for a large number of mediators is challenging due to the composite structure of the null hypothesis, H0: αβ = 0 (α: effect of the exposure on the mediator after adjusting for confounders; β: effect of the mediator on the outcome after adjusting for exposure and confounders). In this paper, we reviewed three classes of methods for large‐scale, one‐at‐a‐time mediation hypothesis testing. These methods are commonly used for continuous outcomes and continuous mediators, assuming there is no exposure‐mediator interaction so that the product αβ has a causal interpretation as the indirect effect. The first class of methods ignores the impact of the different structures under the composite null hypothesis, namely, (1) α = 0, β ≠ 0; (2) α ≠ 0, β = 0; and (3) α = 0, β = 0. The second class of methods weights the reference distribution under each case of the null to form a mixture reference distribution. The third class constructs a composite test statistic using the three p‐values obtained under each case of the null so that the composite statistic has an approximately correct reference distribution. In addition to these existing methods, we developed the Sobel‐comp method, belonging to the second class, which uses a corrected mixture reference distribution for Sobel's test statistic. We performed extensive simulation studies to compare all six methods belonging to these three classes in terms of the false positive rates (FPRs) under the null hypothesis and the true positive rates under the alternative hypothesis. We found that the second class of methods, which uses a mixture reference distribution, could best maintain the FPRs at the nominal level under the null hypothesis and had the greatest true positive rates under the alternative hypothesis. We applied all methods to study the mediation mechanism of DNA methylation sites in the pathway from adult socioeconomic status to glycated hemoglobin level using data from the Multi‐Ethnic Study of Atherosclerosis (MESA). We provide guidelines for choosing the optimal mediation hypothesis testing method in practice and develop an R package, medScan, available on CRAN, implementing all six methods.
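Two classical tests from the first class described above — Sobel's test and the max-P (joint significance) test — reduce to short formulas on the coefficient z-statistics. The sketch below is a toy illustration with made-up z-values, not the medScan implementation.

```python
import numpy as np
from scipy import stats

# z_alpha, z_beta: z-statistics for the exposure->mediator and
# mediator->outcome coefficients, respectively.
def sobel_p(z_alpha, z_beta):
    # Sobel statistic for H0: alpha*beta = 0, referred to N(0, 1).
    z = (z_alpha * z_beta) / np.sqrt(z_alpha**2 + z_beta**2)
    return 2 * stats.norm.sf(abs(z))

def maxp_p(z_alpha, z_beta):
    # max-P: the larger of the two marginal two-sided p-values.
    p_a = 2 * stats.norm.sf(abs(z_alpha))
    p_b = 2 * stats.norm.sf(abs(z_beta))
    return max(p_a, p_b)

print(sobel_p(4.0, 3.5))  # both paths strong -> small p-value
print(maxp_p(4.0, 0.5))   # one null path -> large p-value
```

Both tests use an N(0, 1) reference regardless of which case of the composite null holds, which is exactly the conservativeness the mixture-reference methods in the second class correct.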
Predicting progressions of cognitive outcomes via high-order multi-modal multi-task feature learning
Many existing studies on complex brain disorders, such as Alzheimer's disease, employ regression analysis to associate neuroimaging measures with cognitive status. However, whether these measures in multiple modalities have the predictive power to infer the trajectory of cognitive performance over time still remains under-explored. In this paper, we propose a high-order multi-modal multi-task feature learning model to uncover the temporal relationship between longitudinal neuroimaging measures and progressive cognitive outcome scores. The regularizations through sparsity-inducing norms implemented in the proposed learning model enable the selection of only a small number of imaging features over time and capture modality structures for multi-modal imaging markers. The promising experimental results in extensive empirical studies performed on the ADNI cohort have validated the effectiveness of the proposed method.
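The key ingredient of such sparsity-inducing regularization across tasks is the ℓ2,1 norm, whose proximal operator zeroes out entire rows (imaging features) of a coefficient matrix at once. The sketch below shows only this generic ingredient, not the paper's full high-order model; the matrix values and threshold are invented for illustration.

```python
import numpy as np

# Proximal operator of tau * ||W||_{2,1}, where W is a coefficient matrix
# of shape (features, tasks/time points). Rows with l2 norm below tau are
# set to zero (feature removed across all tasks); other rows are shrunk.
def prox_l21(W, tau):
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W

W = np.array([[0.05, -0.02, 0.01],   # weak feature -> removed entirely
              [1.20,  0.90, 1.10]])  # strong feature -> kept, mildly shrunk
W_sparse = prox_l21(W, tau=0.1)
print(W_sparse)
```

Applying this step inside a proximal-gradient loop yields coefficient matrices in which a small, shared subset of imaging features drives the predictions at every time point.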
