Abstract We develop a prior probability model for temporal Poisson process intensities through structured mixtures of Erlang densities with common scale parameter, mixing on the integer shape parameters. The mixture weights are constructed through increments of a cumulative intensity function which is modeled nonparametrically with a gamma process prior. Such model specification provides a novel extension of Erlang mixtures for density estimation to the intensity estimation setting. The prior model structure supports general shapes for the point process intensity function, and it also enables effective handling of the Poisson process likelihood normalizing term resulting in efficient posterior simulation. The Erlang mixture modeling approach is further elaborated to develop an inference method for spatial Poisson processes. The methodology is examined relative to existing Bayesian nonparametric modeling approaches, including empirical comparison with Gaussian process prior based models, and is illustrated with synthetic and real data examples.
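The weight construction described in the abstract above can be sketched numerically. The sketch below is illustrative only: it uses a fixed Weibull-type cumulative intensity `Lambda0` as a stand-in for a draw from the gamma process prior, and truncates the mixture at `J` components; all parameter values are assumptions, not the paper's specification.

```python
import numpy as np
from scipy.stats import gamma as gamma_dist

# Erlang mixture intensity: lambda(t) = sum_j w_j * Erlang(t; shape=j, scale=theta),
# with weights w_j = Lambda0(j*theta) - Lambda0((j-1)*theta) built from increments
# of a cumulative intensity function Lambda0. Here Lambda0 is a fixed parametric
# function standing in for a gamma process realization (an illustrative assumption).

def erlang_mixture_intensity(t, theta=0.5, J=40,
                             Lambda0=lambda s: 20.0 * (s / 10.0) ** 1.5):
    grid = theta * np.arange(J + 1)          # partition endpoints j * theta
    w = np.diff(Lambda0(grid))               # increments -> mixture weights
    shapes = np.arange(1, J + 1)             # integer Erlang shapes j = 1..J
    dens = gamma_dist.pdf(np.asarray(t)[..., None], a=shapes, scale=theta)
    return dens @ w                          # weighted sum over components

ts = np.linspace(0.1, 10.0, 100)
lam = erlang_mixture_intensity(ts)           # one intensity value per time point
```

Note that the Erlang density with shape `j` and scale `theta` is simply a gamma density with integer shape, which is why `scipy.stats.gamma` suffices here.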
Two‐group Poisson‐Dirichlet mixtures for multiple testing
Abstract The simultaneous testing of multiple hypotheses is common to the analysis of high‐dimensional data sets. The two‐group model, first proposed by Efron, identifies significant comparisons by allocating observations to a mixture of an empirical null and an alternative distribution. In the Bayesian nonparametrics literature, many approaches have suggested using mixtures of Dirichlet Processes in the two‐group model framework. Here, we investigate employing mixtures of two‐parameter Poisson‐Dirichlet Processes instead, and show how they provide a more flexible and effective tool for large‐scale hypothesis testing. Our model further employs nonlocal prior densities to allow separation between the two mixture components. We obtain a closed‐form expression for the exchangeable partition probability function of the two‐group model, which leads to a straightforward Markov Chain Monte Carlo implementation. We compare the performance of our method for large‐scale inference in a simulation study and illustrate its use on both a prostate cancer data set and a case‐control microbiome study of the gastrointestinal tracts in children from underdeveloped countries who have been recently diagnosed with moderate‐to‐severe diarrhea.
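The two-group logic underlying the abstract above can be illustrated with a minimal parametric sketch. This stand-in uses a known N(0,1) null and a single Gaussian alternative with fixed mixing weight, whereas the paper models the components nonparametrically with two-parameter Poisson-Dirichlet mixtures; the distributions, `pi0`, and the 0.2 threshold are all assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Two-group model: z-scores arise from pi0 * f0 (null) + (1 - pi0) * f1 (alternative).
# The local false-discovery rate at z is lfdr(z) = pi0 * f0(z) / f(z).
pi0, n = 0.9, 5000
is_null = rng.random(n) < pi0
z = np.where(is_null, rng.normal(0.0, 1.0, n), rng.normal(3.0, 1.0, n))

f0 = norm.pdf(z, 0.0, 1.0)                   # null density
f1 = norm.pdf(z, 3.0, 1.0)                   # alternative density
lfdr = pi0 * f0 / (pi0 * f0 + (1.0 - pi0) * f1)

discoveries = lfdr < 0.2                     # flag comparisons with low local fdr
```

In the Bayesian nonparametric versions, `f0` and `f1` are replaced by posterior estimates from the mixture model rather than fixed Gaussians.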
- Award ID(s): 1659921
- PAR ID: 10451041
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Biometrics
- Volume: 77
- Issue: 2
- ISSN: 0006-341X
- Format(s): Medium: X
- Size: p. 622-633
- Sponsoring Org: National Science Foundation
More Like this
-
A flexible Erlang mixture model for survival analysis is introduced. The model for the survival density is built from a structured mixture of Erlang densities, mixing on the integer shape parameter with a common scale parameter. The mixture weights are constructed through increments of a distribution function on the positive real line, which is assigned a Dirichlet process prior. The model has a relatively simple structure, balancing flexibility with efficient posterior computation. Moreover, it implies a mixture representation for the hazard function that involves time-dependent mixture weights, thus offering a general approach to hazard estimation. The model is extended to accommodate survival responses corresponding to multiple experimental groups, using a dependent Dirichlet process prior for the group-specific distributions that define the mixture weights. Model properties, prior specification, and posterior simulation are discussed, and the methodology is illustrated with synthetic and real data examples.
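The hazard representation mentioned in this abstract can be sketched directly, since both the density and the survival function of an Erlang mixture are available in closed form. The sketch below uses a fixed exponential distribution function `G` as a stand-in for a Dirichlet process draw, and renormalizes after truncation; these are illustrative assumptions, not the paper's prior.

```python
import numpy as np
from scipy.stats import gamma as gamma_dist, expon

# Survival density as an Erlang mixture: f(t) = sum_j w_j * Erlang(t; j, theta),
# with weights w_j = G(j*theta) - G((j-1)*theta) for a distribution function G
# on the positive real line. G is a fixed exponential CDF here, standing in for
# a Dirichlet process realization (an illustrative assumption).
theta, J = 0.5, 60
G = expon(scale=5.0).cdf
grid = theta * np.arange(J + 1)
w = np.diff(G(grid))
w = w / w.sum()                              # renormalize the truncated weights

shapes = np.arange(1, J + 1)
t = np.linspace(0.1, 15.0, 200)
f = gamma_dist.pdf(t[:, None], a=shapes, scale=theta) @ w   # mixture density
S = gamma_dist.sf(t[:, None], a=shapes, scale=theta) @ w    # mixture survival
hazard = f / S                               # h(t) = f(t) / S(t)
```

Dividing through by `S` is what produces the time-dependent mixture weights in the hazard representation: each component's weight is rescaled by its relative survival at `t`.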
-
How to cluster event sequences generated by different point processes is an interesting and important problem in statistical machine learning. To solve this problem, we propose and discuss an effective model-based clustering method based on a novel Dirichlet mixture model of a special but significant type of point process, the Hawkes process. The proposed model generates the event sequences of different clusters from Hawkes processes with different parameters, and uses a Dirichlet distribution as the prior distribution over the clusters. We prove the identifiability of our mixture model and propose an effective variational Bayesian inference algorithm to learn it. An adaptive inner iteration allocation strategy is designed to accelerate the convergence of the algorithm. Moreover, we investigate the sample complexity and the computational complexity of our learning algorithm in depth. Experiments on both synthetic and real-world data show that the clustering method based on our model can robustly learn structural triggering patterns hidden in asynchronous event sequences and achieve superior performance in clustering purity and consistency compared to existing methods.
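The Hawkes process at the core of this abstract is easy to simulate, which is how synthetic event sequences for clustering experiments are typically generated. The sketch below simulates a univariate Hawkes process with an exponential kernel via Ogata's thinning algorithm; the parameter values are assumptions for illustration, and the clustering model itself is beyond a snippet.

```python
import numpy as np

rng = np.random.default_rng(2)

# Univariate Hawkes intensity with exponential kernel:
# lambda(t) = mu + sum_{t_i < t} alpha * beta * exp(-beta * (t - t_i)),
# where alpha < 1 is the branching ratio (stationarity condition).
def intensity(t, events, mu, alpha, beta):
    if len(events) == 0:
        return mu
    dt = t - np.asarray(events)
    return mu + alpha * beta * np.sum(np.exp(-beta * dt))

def simulate_hawkes(mu=0.5, alpha=0.6, beta=1.0, T=100.0):
    events, t = [], 0.0
    while True:
        # Between events the intensity only decays, so the current value
        # is a valid upper bound for Ogata's thinning step.
        lam_bar = intensity(t, events, mu, alpha, beta)
        t += rng.exponential(1.0 / lam_bar)
        if t >= T:
            break
        if rng.random() < intensity(t, events, mu, alpha, beta) / lam_bar:
            events.append(t)                 # accept the candidate event
    return np.array(events)

seq = simulate_hawkes()
```

The expected number of events is roughly `mu * T / (1 - alpha)`, which is why the branching ratio `alpha` must stay below one.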
-
Dimension constraints improve hypothesis testing for large-scale, graph-associated, brain-image data
Summary For large-scale testing with graph-associated data, we present an empirical Bayes mixture technique to score local false-discovery rates (FDRs). Compared to procedures that ignore the graph, the proposed Graph-based Mixture Model (GraphMM) method gains power in settings where non-null cases form connected subgraphs, and it does so by regularizing parameter contrasts between testing units. Simulations show that GraphMM controls the FDR in a variety of settings, though it may lose control with excessive regularization. On magnetic resonance imaging data from a study of brain changes associated with the onset of Alzheimer's disease, GraphMM produces greater yield than conventional large-scale testing procedures.
-
Summary We propose a novel class of dynamic shrinkage processes for Bayesian time series and regression analysis. Building on a global–local framework of prior construction, in which continuous scale mixtures of Gaussian distributions are employed for both desirable shrinkage properties and computational tractability, we model dependence between the local scale parameters. The resulting processes inherit the desirable shrinkage behaviour of popular global–local priors, such as the horseshoe prior, but provide additional localized adaptivity, which is important for modelling time series data or regression functions with local features. We construct a computationally efficient Gibbs sampling algorithm based on a Pólya–gamma scale mixture representation of the proposed process. Using dynamic shrinkage processes, we develop a Bayesian trend filtering model that produces more accurate estimates and tighter posterior credible intervals than competing methods, and we apply the model to irregular curve fitting of minute-by-minute Twitter central processor unit usage data. In addition, we develop an adaptive time-varying parameter regression model to assess the efficacy of the Fama–French five-factor asset pricing model with momentum added as a sixth factor. Our dynamic analysis of manufacturing and healthcare industry data shows that, with the exception of market risk, no other risk factors are significant except for brief periods.
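The dependence between local scale parameters described in this abstract can be sketched generatively. The sketch below places an AR(1) process on the log local variances so that shrinkage adapts smoothly over time; Gaussian innovations are used as a simplification (the dynamic horseshoe of the paper uses Z-distributed innovations), and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Global-local shrinkage with dependent local scales:
# beta_t ~ N(0, tau^2 * lambda_t^2), with h_t = log(lambda_t^2) following
# an AR(1) process. Gaussian innovations are a simplification of the
# Z-distributed innovations underlying the dynamic horseshoe.
T, phi, mu_h, sigma_h, tau = 300, 0.95, -2.0, 0.5, 1.0

h = np.empty(T)
h[0] = mu_h
for t in range(1, T):
    h[t] = mu_h + phi * (h[t - 1] - mu_h) + sigma_h * rng.standard_normal()

local_scale = np.exp(h / 2.0)                         # lambda_t
beta = tau * local_scale * rng.standard_normal(T)     # locally adaptive coefficients
```

Because `h_t` is persistent (`phi` near one), stretches of small `lambda_t` shrink whole runs of coefficients toward zero while local features can still escape shrinkage together.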