Motivated by a multimodal neuroimaging study of Alzheimer's disease, we study in this article the inference problem, that is, hypothesis testing, in sequential mediation analysis. Existing sequential mediation solutions mostly focus on sparse estimation, whereas hypothesis testing is a distinct and more challenging problem. Meanwhile, the few available mediation testing solutions often ignore potential dependency among the mediators or cannot be applied directly to the sequential setting. We propose a statistical inference procedure to test mediation pathways when there are sequentially ordered multiple data modalities and each modality involves multiple mediators. We allow the mediators to be conditionally dependent and the number of mediators within each modality to diverge with the sample size. We provide explicit significance quantification and establish theoretical guarantees in terms of asymptotic size, power, and false discovery control. We demonstrate the efficacy of the method through both simulations and an application to a multimodal neuroimaging pathway analysis of Alzheimer's disease.
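For orientation only, the sketch below illustrates the simplest form of pathway-level mediation testing for a single ordered path X → M1 → M2 → Y via the joint-significance rule. It is a hypothetical simplification with made-up variable names, not the proposed procedure, which handles many conditionally dependent mediators per modality and controls the false discovery rate across pathways.

```python
# A minimal, illustrative sketch (not the authors' procedure): a joint-significance
# test for one sequential pathway X -> M1 -> M2 -> Y, where each arrow is assessed
# by the corresponding regression coefficient. All names are hypothetical.
import numpy as np
import statsmodels.api as sm

def joint_significance_pvalue(X, M1, M2, Y):
    """P-value for the pathway X -> M1 -> M2 -> Y: the pathway is significant
    only if every link on it is, so we report the maximum link-wise p-value."""
    # Link 1: X -> M1
    p1 = sm.OLS(M1, sm.add_constant(X)).fit().pvalues[1]
    # Link 2: M1 -> M2, adjusting for X
    p2 = sm.OLS(M2, sm.add_constant(np.column_stack([M1, X]))).fit().pvalues[1]
    # Link 3: M2 -> Y, adjusting for M1 and X
    p3 = sm.OLS(Y, sm.add_constant(np.column_stack([M2, M1, X]))).fit().pvalues[1]
    return max(p1, p2, p3)
```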
While fiducial inference is widely regarded as a big blunder of R.A. Fisher's, the goal he originally set, ‘inferring the uncertainty of model parameters on the basis of observations’, has been continually pursued by many statisticians. To this end, we develop a new statistical inference method called extended fiducial inference (EFI). The new method achieves the goal of fiducial inference by leveraging advanced statistical computing techniques while remaining scalable for big data. EFI jointly imputes the random errors realized in the observations using stochastic gradient Markov chain Monte Carlo and estimates the inverse function using a sparse deep neural network (DNN). The consistency of the sparse DNN estimator ensures that the uncertainty embedded in the observations is properly propagated to the model parameters through the estimated inverse function, thereby validating downstream statistical inference. Compared to frequentist and Bayesian methods, EFI offers significant advantages in parameter estimation and hypothesis testing. Specifically, EFI provides higher fidelity in parameter estimation, especially when outliers are present in the observations, and it eliminates the need for theoretical reference distributions in hypothesis testing, thereby automating the statistical inference process. EFI also provides an innovative framework for semisupervised learning.
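To fix ideas, here is a toy illustration of the underlying fiducial principle that EFI generalizes: solve the data-generating equation for the parameter in terms of the realized random errors and propagate the error distribution to the parameter. The normal-mean model, the sampling setup, and all variable names below are our own illustrative assumptions; the actual EFI algorithm imputes the errors with stochastic gradient MCMC and learns the inverse mapping with a sparse DNN.

```python
# Toy fiducial calculation (NOT the EFI algorithm itself), model: y_i = mu + e_i, e_i ~ N(0, 1).
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=50)      # observed data; mu = 2 is unknown to the analyst

# Structural equation solved for the parameter: mu = mean(y) - mean(e).
# Drawing fresh errors and plugging them in yields a fiducial sample for mu.
n_draws = 10_000
e_star = rng.normal(size=(n_draws, y.size))
mu_fiducial = y.mean() - e_star.mean(axis=1)

print("fiducial point estimate:", mu_fiducial.mean())
print("95% fiducial interval:", np.quantile(mu_fiducial, [0.025, 0.975]))
```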
- NSF-PAR ID: 10530786
- Publisher / Repository: Oxford University Press
- Journal Name: Journal of the Royal Statistical Society Series B: Statistical Methodology
- ISSN: 1369-7412
- Sponsoring Org: National Science Foundation
More Like this
Quantiles and expected shortfalls are commonly used risk measures in financial risk management. The two measures are closely related yet have distinct features. Our primary goal in this work is to develop a stable and practical inference method for the conditional expected shortfall. We consider joint modelling of the conditional quantile and expected shortfall to facilitate the statistical inference procedure. While the regression coefficients can be estimated jointly by minimizing a class of strictly consistent joint loss functions, the computation is challenging, especially when the parameter dimension is large, since the loss functions are neither differentiable nor convex. We propose a two-step estimation procedure that reduces the computational effort by first estimating the quantile regression parameters with standard quantile regression. We show that the two-step estimator has the same asymptotic properties as the joint estimator, but the former is numerically more efficient. We develop a score-type inference method for hypothesis testing and confidence interval construction. Compared to the Wald-type method, the score method is robust against heterogeneity and is superior in finite samples, especially in cases with many confounding factors. The advantages of our proposed method over existing approaches are demonstrated by simulations and empirical studies based on income and college education data.
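As a rough sketch of the two-step idea (one common variant of it, with our own assumed linear models and variable names, not the paper's exact loss functions or score-type inference): fit a standard quantile regression first, then estimate the conditional expected shortfall by least squares on a surrogate response built from the fitted quantiles.

```python
# Illustrative two-step estimation of a linear conditional expected shortfall model.
import numpy as np
import statsmodels.api as sm

def two_step_es(X, y, alpha=0.05):
    Xc = sm.add_constant(X)
    # Step 1: quantile regression at level alpha.
    q_fit = sm.QuantReg(y, Xc).fit(q=alpha)
    q_hat = q_fit.predict(Xc)
    # Step 2: surrogate response whose conditional mean equals the expected shortfall
    # E[y | y <= conditional alpha-quantile], followed by ordinary least squares.
    z = q_hat + (y - q_hat) * (y <= q_hat) / alpha
    es_fit = sm.OLS(z, Xc).fit()
    return q_fit.params, es_fit.params
```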
Ruiz, F.; Dy, J.; Meent, J.-W. (Eds.)
Prediction algorithms, such as deep neural networks (DNNs), are used in many domain sciences to directly estimate internal parameters of interest in simulator-based models, especially in settings where the observations include images or complex high-dimensional data. In parallel, modern neural density estimators, such as normalizing flows, are becoming increasingly popular for uncertainty quantification, especially when both parameters and observations are high-dimensional. However, parameter inference is an inverse problem and not a prediction task; thus, an open challenge is to construct conditionally valid and precise confidence regions, with a guaranteed probability of covering the true parameters of the data-generating process, no matter what the (unknown) parameter values are, and without relying on large-sample theory. Many simulator-based inference (SBI) methods are indeed known to produce biased or overly confident parameter regions, yielding misleading uncertainty estimates. This paper presents WALDO, a novel method to construct confidence regions with finite-sample conditional validity by leveraging prediction algorithms or posterior estimators that are currently widely adopted in SBI. WALDO reframes the well-known Wald test statistic and uses computationally efficient regression-based machinery for classical Neyman inversion of hypothesis tests. We apply our method to a recent high-energy physics problem, where prediction with DNNs has previously led to estimates with prediction bias. We also illustrate how our approach can correct overly confident posterior regions computed with normalizing flows.
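The sketch below is our schematic reading of the WALDO recipe for a scalar parameter, with a hypothetical `simulate(theta, rng)` function and off-the-shelf gradient-boosting regressors standing in for the prediction or posterior estimators; it conveys the Wald-statistic-plus-Neyman-inversion construction rather than reproducing the paper's implementation.

```python
# Schematic WALDO-style confidence set for a scalar parameter (illustrative only).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def waldo_confidence_set(simulate, x_obs, theta_grid, n_sim=5000, alpha=0.1, seed=None):
    rng = np.random.default_rng(seed)
    # 1) Simulate (theta, data) pairs from a proposal and the simulator.
    thetas = rng.uniform(theta_grid.min(), theta_grid.max(), n_sim)
    X = np.vstack([simulate(t, rng) for t in thetas])
    # 2) Estimate E[theta | D] and Var[theta | D] with prediction algorithms.
    mean_hat = GradientBoostingRegressor().fit(X, thetas)
    resid2 = (thetas - mean_hat.predict(X)) ** 2
    var_hat = GradientBoostingRegressor().fit(X, resid2)
    # 3) Wald-type statistic on the simulations.
    def tau(Xmat, theta0):
        return (mean_hat.predict(Xmat) - theta0) ** 2 / np.maximum(var_hat.predict(Xmat), 1e-8)
    tau_sim = tau(X, thetas)
    # 4) Estimate the conditional (1 - alpha) critical value C(theta) by quantile regression.
    crit_reg = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha)
    crit_reg.fit(thetas.reshape(-1, 1), tau_sim)
    # 5) Neyman inversion: keep every theta0 whose observed statistic is below its critical value.
    tau_obs = np.array([tau(x_obs.reshape(1, -1), t)[0] for t in theta_grid])
    crit = crit_reg.predict(theta_grid.reshape(-1, 1))
    return theta_grid[tau_obs <= crit]
```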
Samejima's graded response model (GRM) has gained popularity in the analysis of ordinal response data in psychological, educational, and health-related assessment. Obtaining high-quality point and interval estimates for GRM parameters has attracted a great deal of attention in the literature. In the current work, we derive generalized fiducial inference (GFI) for a family of multidimensional graded response models, implement a Gibbs sampler to perform fiducial estimation, and compare its finite-sample performance with several commonly used likelihood-based and Bayesian approaches via three simulation studies. We find that the proposed method yields reliable inference even with small sample sizes and extreme generating parameter values, outperforming the other candidate methods under investigation. The use of GFI as a convenient tool to quantify sampling variability in various inferential procedures is illustrated by an empirical data analysis using patient-reported emotional distress data.
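For orientation, the snippet below computes category probabilities under a standard unidimensional GRM with the usual logistic parameterization (the item parameters shown are our own illustrative values); the paper's multidimensional family and its generalized fiducial Gibbs sampler build on this kind of model, but the code is not part of that procedure.

```python
# Category probabilities under a standard unidimensional graded response model.
import numpy as np

def grm_category_probs(theta, a, b):
    """P(Y = k | theta) for k = 0, ..., K under Samejima's GRM.

    theta : latent trait (scalar)
    a     : item discrimination (scalar, > 0)
    b     : ordered category thresholds, length K
    """
    # Boundary probabilities P(Y >= k), padded with P(Y >= 0) = 1 and P(Y >= K+1) = 0.
    cum = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b))))
    cum = np.concatenate(([1.0], cum, [0.0]))
    return cum[:-1] - cum[1:]

# Example: a five-category item with hypothetical parameters.
print(grm_category_probs(theta=0.3, a=1.2, b=[-1.5, -0.5, 0.5, 1.5]))
```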
The weighted nearest neighbors (WNN) estimator has been widely used as a flexible and easy-to-implement nonparametric tool for mean regression estimation. The bagging technique is an elegant way to form WNN estimators with weights automatically assigned to the nearest neighbors (Steele, 2009; Biau et al., 2010); we refer to the resulting estimator as the distributional nearest neighbors (DNN) estimator for easy reference. Yet there is a lack of distributional results for this estimator, limiting its application to statistical inference. Moreover, when the mean regression function has higher-order smoothness, the DNN estimator does not achieve the optimal nonparametric convergence rate, mainly because of its bias. In this work, we provide an in-depth technical analysis of the DNN estimator, based on which we suggest a bias-reduction approach that linearly combines two DNN estimators with different subsampling scales, resulting in the novel two-scale DNN (TDNN) estimator. The two-scale DNN estimator has an equivalent representation as a WNN estimator with weights admitting explicit forms, some of which are negative. We prove that, thanks to the use of negative weights, the two-scale DNN estimator enjoys the optimal nonparametric rate of convergence in estimating the regression function under a fourth-order smoothness condition. We further go beyond estimation and establish that the DNN and two-scale DNN estimators are both asymptotically normal as the subsampling scales and the sample size diverge to infinity. For practical implementation, we also provide variance estimators and a distribution estimator using the jackknife and bootstrap techniques for the two-scale DNN. These estimators can be exploited for constructing valid confidence intervals for nonparametric inference of the regression function. The theoretical results and appealing finite-sample performance of the suggested two-scale DNN method are illustrated with several simulation examples and a real data application.
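As a compact sketch of the construction just described (our own code: the closed-form subsample-probability weights for the DNN estimator and the bias-cancelling combination of two scales are stated assumptions, not the paper's implementation):

```python
# Point estimates for the DNN and two-scale DNN (TDNN) regression estimators.
import numpy as np
from scipy.special import comb

def dnn_estimate(X, y, x0, s):
    """DNN estimate of E[Y | X = x0] with subsampling scale s: a weighted average of
    responses ordered by distance to x0, where w_i is the probability that the i-th
    nearest neighbor is the closest point in a random size-s subsample."""
    n = len(y)
    order = np.argsort(np.linalg.norm(X - x0, axis=1))   # neighbors of x0, nearest first
    i = np.arange(1, n + 1)
    w = comb(n - i, s - 1) / comb(n, s)                  # zero for i > n - s + 1
    return np.dot(w, y[order])

def tdnn_estimate(X, y, x0, s1, s2):
    """Two-scale DNN: combine scales s1 < s2 so the leading s^{-2/d} bias terms cancel,
    which forces a negative weight on the smaller scale."""
    d = X.shape[1]
    w1 = 1.0 / (1.0 - (s2 / s1) ** (2.0 / d))
    w2 = 1.0 - w1
    return w1 * dnn_estimate(X, y, x0, s1) + w2 * dnn_estimate(X, y, x0, s2)
```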