High‐dimensional multinomial regression models are very useful in practice but have received less research attention than logistic regression models, especially from the perspective of statistical inference. In this work, we analyze the estimation and prediction error of the contrast‐based ℓ1‐penalized multinomial regression model and extend the debiasing method to the multinomial case, providing a valid confidence interval for each coefficient and a p‐value for each individual hypothesis test. We also examine cases of model misspecification and non‐identically distributed data to demonstrate the robustness of our method when some assumptions are violated. We apply the debiasing method to identify important predictors in the progression into dementia of different subtypes. Results from extensive simulations show the superiority of the debiasing method compared to other inference methods.
A Unified Bayesian Framework for Modeling Measurement Error in Multinomial Data
Measurement error in multinomial data is a well-known and well-studied inferential problem that is encountered in many fields, including engineering, biomedical and omics research, ecology, finance, official statistics, and social sciences. Methods developed to accommodate measurement error in multinomial data are typically equipped to handle false negatives or false positives, but not both. We provide a unified framework for accommodating both forms of measurement error using a Bayesian hierarchical approach. We demonstrate the proposed method’s performance on simulated data and apply it to acoustic bat monitoring and official crime data.
- Award ID(s):
- 2245492
- PAR ID:
- 10596191
- Publisher / Repository:
- International Society for Bayesian Analysis
- Date Published:
- Journal Name:
- Bayesian Analysis
- ISSN:
- 1931-6690
- Subject(s) / Keyword(s):
- categorical data, criminology, ecology, misclassification, record linkage, zero-inflation.
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
An accurate non-model-based method for delamination identification of laminated composite plates is proposed in this work. A weighted mode shape damage index is formulated using squared weighted difference between a measured mode shape of a composite plate with delamination and one from a polynomial that fits the measured mode shape of the composite plate with a proper order. Weighted mode shape damage indices associated with at least two measured mode shapes of the same mode are synthesized to formulate a synthetic mode shape damage index to exclude some false positive identification results due to measurement noise and error. An auxiliary mode shape damage index is proposed to further assist delamination identification, by which some false negative identification results can be excluded and edges of a delamination area can be accurately and completely identified. Both numerical and experimental examples are presented to investigate effectiveness of the proposed method, and it is shown that edges of a delamination area in composite plates can be accurately and completely identified when measured mode shapes are contaminated by measurement noise and error. In the experimental example, identification results of a composite plate with delamination from the proposed method are validated by its C-scan image.
-
Abstract Brain-effective connectivity analysis quantifies directed influence of one neural element or region over another, and it is of great scientific interest to understand how effective connectivity pattern is affected by variations of subject conditions. Vector autoregression (VAR) is a useful tool for this type of problem. However, there is a paucity of solutions when there is measurement error, when there are multiple subjects, and when the focus is the inference of the transition matrix. In this article, we study the problem of transition matrix inference under the high-dimensional VAR model with measurement error and multiple subjects. We propose a simultaneous testing procedure, with three key components: a modified expectation-maximization (EM) algorithm, a test statistic based on the tensor regression of a bias-corrected estimator of the lagged auto-covariance given the covariates, and a properly thresholded simultaneous test. We establish the uniform consistency for the estimators of our modified EM, and show that the subsequent test achieves a consistent false discovery control and that its power approaches one asymptotically. We demonstrate the efficacy of our method through both simulations and a brain connectivity study of task-evoked functional magnetic resonance imaging.
-
Abstract The Event Horizon Telescope has released polarized images of the supermassive black holes Messier 87* (M87*) and Sagittarius A* accretion disks. As more images are produced, our understanding of the average polarized emission from near the event horizon improves. In this Letter, we use a semianalytic model for optically thin, equatorial emission near a Kerr black hole to study how spin constraints follow from measurements of the average polarization spiral pitch angle. We focus on the case of M87* and explore how the direct, weakly lensed image spiral is coupled to the strongly lensed indirect image spiral, and how a precise measurement of both provides a powerful spin tracer. We find a generic result that the spin twists the direct and indirect image polarization in opposite directions. Using a grid search over model parameters, we find a strong dependence of the resulting spin constraint on plasma properties near the horizon. Grid constraints suggest that, under reasonable assumptions for the accretion disk, a measurement of the direct and indirect image spiral pitch angles to ±5° yields a dimensionless spin amplitude measurement with uncertainty for radially infalling models but otherwise provides only weak constraints; an error of 1° can reach . We also find that a well-constrained rotation measure greatly improves spin measurements. Assuming that equatorial velocity and magnetic field are oppositely oriented, we find that the observed M87* polarization pattern favors models with strong radial velocity components, which are close to optimal for future spin measurements.
-
Abstract Molecular ecology regularly requires the analysis of count data that reflect the relative abundance of features of a composition (e.g., taxa in a community, gene transcripts in a tissue). The sampling process that generates these data can be modelled using the multinomial distribution. Replicate multinomial samples inform the relative abundances of features in an underlying Dirichlet distribution. These distributions together form a hierarchical model for relative abundances among replicates and sampling groups. This type of Dirichlet‐multinomial modelling (DMM) has been described previously, but its benefits and limitations are largely untested. With simulated data, we quantified the ability of DMM to detect differences in proportions between treatment and control groups, and compared the efficacy of three computational methods to implement DMM—Hamiltonian Monte Carlo (HMC), variational inference (VI), and Gibbs Markov chain Monte Carlo. We report that DMM was better able to detect shifts in relative abundances than analogous analytical tools, while identifying an acceptably low number of false positives. Among methods for implementing DMM, HMC provided the most accurate estimates of relative abundances, and VI was the most computationally efficient. The sensitivity of DMM was exemplified through analysis of previously published data describing lung microbiomes. We report that DMM identified several potentially pathogenic, bacterial taxa as more abundant in the lungs of children who aspirated foreign material during swallowing; these differences went undetected with different statistical approaches. Our results suggest that DMM has strong potential as a statistical method to guide inference in molecular ecology.

