NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

A Bayesian Zero-Inflated Dirichlet-Multinomial Regression Model for Multivariate Compositional Count Data

https://doi.org/10.1111/biom.13853

Koslovsky, Matthew D. (March 2023, Biometrics)

Abstract The Dirichlet-multinomial (DM) distribution plays a fundamental role in modern statistical methodology development and application. Recently, the DM distribution and its variants have been used extensively to model multivariate count data generated by high-throughput sequencing technology in omics research due to its ability to accommodate the compositional structure of the data as well as overdispersion. A major limitation of the DM distribution is that it is unable to handle excess zeros typically found in practice which may bias inference. To fill this gap, we propose a novel Bayesian zero-inflated DM model for multivariate compositional count data with excess zeros. We then extend our approach to regression settings and embed sparsity-inducing priors to perform variable selection for high-dimensional covariate spaces. Throughout, modeling decisions are made to boost scalability without sacrificing interpretability or imposing limiting assumptions. Extensive simulations and an application to a human gut microbiome dataset are presented to compare the performance of the proposed method to existing approaches. We provide an accompanying R package with a user-friendly vignette to apply our method to other datasets.
more » « less
Analyzing microbiome data with taxonomic misclassification using a zero-inflated Dirichlet-multinomial model

https://doi.org/10.1186/s12859-025-06078-4

Koslovsky, Matthew_D (February 2025, BMC Bioinformatics)
A Unified Bayesian Framework for Modeling Measurement Error in Multinomial Data

Koslovsky, MD; Kaplan, A; Terranova, VA; Mevin, MB (October 2024, Bayesian Analysis)

Measurement error in multinomial data is a well-known and well-studied inferential problem that is encountered in many fields, including engineering, biomedical and omics research, ecology, finance, official statistics, and social sciences. Methods developed to accommodate measurement error in multinomial data are typically equipped to handle false negatives or false positives, but not both. We provide a unified framework for accommodating both forms of measurement error using a Bayesian hierarchical approach. We demonstrate the proposed method’s performance on simulated data and apply it to acoustic bat monitoring and official crime data.
more » « less
Full Text Available

Search for: All records