We introduce a flexible Erlang mixture model for survival analysis. The model for the survival density is built from a structured mixture of Erlang densities, mixing on the integer shape parameter with a common scale parameter. The mixture weights are constructed through increments of a distribution function on the positive real line, which is assigned a Dirichlet process prior. The model has a relatively simple structure, balancing flexibility with efficient posterior computation. Moreover, it implies a mixture representation for the hazard function that involves time-dependent mixture weights, thus offering a general approach to hazard estimation. The model is extended to accommodate survival responses corresponding to multiple experimental groups, using a dependent Dirichlet process prior for the group-specific distributions that define the mixture weights. Model properties, prior specification, and posterior simulation are discussed, and the methodology is illustrated with synthetic and real data examples.
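As a sketch of the structure described above, the survival density and the implied hazard can be written as follows. The notation is assumed here for illustration and is not taken from the paper: M is the number of mixture components, \theta the common scale, G the distribution function assigned the Dirichlet process prior, and f_m, S_m, h_m = f_m / S_m the density, survival function, and hazard of the Erlang component with shape m. The convention that the last weight absorbs the remaining tail mass is also an assumption.

```latex
\begin{align*}
f(t) &= \sum_{m=1}^{M} \omega_m \, f_m(t), \qquad
f_m(t) = \frac{t^{m-1} e^{-t/\theta}}{\theta^{m} (m-1)!}, \\
\omega_m &= G(m\theta) - G((m-1)\theta), \quad m = 1, \dots, M-1, \qquad
\omega_M = 1 - G((M-1)\theta), \\
h(t) &= \frac{f(t)}{S(t)}
      = \frac{\sum_{m=1}^{M} \omega_m f_m(t)}{\sum_{j=1}^{M} \omega_j S_j(t)}
      = \sum_{m=1}^{M} \omega_m^{*}(t) \, h_m(t), \qquad
\omega_m^{*}(t) = \frac{\omega_m S_m(t)}{\sum_{j=1}^{M} \omega_j S_j(t)}.
\end{align*}
```

The weights \omega_m^{*}(t) are nonnegative and sum to one at every t, which is the sense in which the hazard is a mixture with time-dependent weights.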
Erlang mixture modeling for Poisson process intensities
Abstract: We develop a prior probability model for temporal Poisson process intensities through structured mixtures of Erlang densities with a common scale parameter, mixing on the integer shape parameters. The mixture weights are constructed through increments of a cumulative intensity function, which is modeled nonparametrically with a gamma process prior. This model specification provides a novel extension of Erlang mixtures for density estimation to the intensity estimation setting. The prior model structure supports general shapes for the point process intensity function, and it also enables effective handling of the Poisson process likelihood normalizing term, resulting in efficient posterior simulation. The Erlang mixture modeling approach is further elaborated to develop an inference method for spatial Poisson processes. The methodology is examined relative to existing Bayesian nonparametric modeling approaches, including empirical comparison with Gaussian process prior based models, and is illustrated with synthetic and real data examples.
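The following is a minimal Python sketch of the construction the abstract describes, not the authors' implementation. All names (`draw_weights`, `intensity`, `integrated_intensity`, `c0`, `base_cum_intensity`) are hypothetical, and the choice of intervals [(j-1)*scale, j*scale) for the increments is an assumption. It draws the weights as independent gamma increments of a gamma process and shows why the likelihood normalizing term is easy to handle: the integral of the intensity over (0, T) reduces to a weighted sum of Erlang distribution functions.

```python
import math
import random


def erlang_pdf(t, shape, scale):
    """Erlang density with integer shape and the common scale."""
    if t < 0:
        return 0.0
    if t == 0:
        return 1.0 / scale if shape == 1 else 0.0
    log_pdf = ((shape - 1) * math.log(t) - t / scale
               - shape * math.log(scale) - math.lgamma(shape))
    return math.exp(log_pdf)


def erlang_cdf(t, shape, scale):
    """Erlang distribution function: 1 - sum_{k<shape} e^{-x} x^k / k!, x = t/scale."""
    if t <= 0:
        return 0.0
    x = t / scale
    term = math.exp(-x)  # k = 0 term
    total = term
    for k in range(1, shape):
        term *= x / k
        total += term
    return 1.0 - total


def draw_weights(num_components, scale, c0, base_cum_intensity, rng):
    """Prior draw of the weights as gamma-process increments (assumed intervals).

    Each increment over [(j-1)*scale, j*scale) is an independent gamma variable
    with shape c0 * (increment of the base cumulative intensity) and rate c0.
    """
    weights = []
    for j in range(1, num_components + 1):
        inc = base_cum_intensity(j * scale) - base_cum_intensity((j - 1) * scale)
        weights.append(rng.gammavariate(c0 * inc, 1.0 / c0))  # (shape, scale)
    return weights


def intensity(t, weights, scale):
    """Erlang mixture intensity: sum_j w_j * Erlang(t | shape j, common scale)."""
    return sum(w * erlang_pdf(t, j, scale) for j, w in enumerate(weights, start=1))


def integrated_intensity(T, weights, scale):
    """Closed-form integral of the intensity over (0, T)."""
    return sum(w * erlang_cdf(T, j, scale) for j, w in enumerate(weights, start=1))


rng = random.Random(0)
weights = draw_weights(10, 0.5, 2.0, lambda t: 3.0 * t, rng)
print(intensity(1.0, weights, 0.5), integrated_intensity(4.0, weights, 0.5))
```

Because the normalizing term is linear in the weights, no numerical integration of the intensity is needed when evaluating the Poisson process likelihood under this representation.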
- Award ID(s): 1950902
- PAR ID: 10363460
- Publisher / Repository: Springer Science + Business Media
- Date Published:
- Journal Name: Statistics and Computing
- Volume: 32
- Issue: 1
- ISSN: 0960-3174
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- We extend network tomography to traffic flows that are not necessarily Poisson random processes. The Poisson assumption has governed the field since its inception in 1996 by Y. Vardi. We allow the distribution of the packet count of each traffic flow in a given time interval to be a mixture of Poisson random variables. Both discrete and continuous mixtures are studied. For the latter case, we focus on mixed Poisson distributions with a Gamma mixing distribution. As is well known, this mixed Poisson distribution is the negative binomial distribution. Other mixing distributions, such as Wald or the inverse Gaussian distribution, can also be used. Mixture distributions are overdispersed, with variance larger than the mean. Thus, they are more suitable for Internet traffic than the Poisson model. We develop a second-order moment matching approach for estimating the mean traffic rate for each source-destination pair using least squares and the minimum I-divergence iterative procedure. We demonstrate the performance of the proposed approach with several numerical examples. The results show that the averaged normalized mean squared error in rate estimation is of the same order as in classic Poisson-based network tomography. Furthermore, no degradation in performance was observed when traffic rates are Poisson but Poisson mixtures are assumed.
- Abstract: The simultaneous testing of multiple hypotheses is common to the analysis of high‐dimensional data sets. The two‐group model, first proposed by Efron, identifies significant comparisons by allocating observations to a mixture of an empirical null and an alternative distribution. In the Bayesian nonparametrics literature, many approaches have suggested using mixtures of Dirichlet Processes in the two‐group model framework. Here, we investigate employing mixtures of two‐parameter Poisson‐Dirichlet Processes instead, and show how they provide a more flexible and effective tool for large‐scale hypothesis testing. Our model further employs nonlocal prior densities to allow separation between the two mixture components. We obtain a closed‐form expression for the exchangeable partition probability function of the two‐group model, which leads to a straightforward Markov Chain Monte Carlo implementation. We compare the performance of our method for large‐scale inference in a simulation study and illustrate its use on both a prostate cancer data set and a case‐control microbiome study of the gastrointestinal tracts in children from underdeveloped countries who have been recently diagnosed with moderate‐to‐severe diarrhea.
- Abstract: The field of forensic statistics offers a unique hierarchical data structure in which a population is composed of several subpopulations of sources and a sample is collected from each source. This subpopulation structure creates an additional layer of complexity. Hence, the data has a hierarchical structure in addition to the existence of underlying subpopulations. Finite mixtures are known for modeling heterogeneity; however, previous parameter estimation procedures assume that the data is generated through a simple random sampling process. We propose using a semi‐supervised mixture modeling approach to model the subpopulation structure which leverages the fact that we know the collection of samples came from the same source, yet an unknown subpopulation. A simulation study and a real data analysis based on famous glass datasets and a keystroke dynamic typing data set show that the proposed approach performs better than other approaches that have been used previously in practice.
- Bayesian non-parametric (BNP) modeling has been developed and proven to be a powerful tool to analyze messy data with complex structures. Despite its increasing popularity, BNP modeling also faces challenges. One challenge is the estimation of the precision parameter in Dirichlet process mixtures. In this study, we focus on a BNP growth curve model and investigate how non-informative, weakly informative, accurate informative, and inaccurate informative priors affect model convergence, parameter estimation, and computation time. A simulation study has been conducted. We conclude that the non-informative prior for the precision parameter is less preferred because it yields a much lower convergence rate, and that growth curve parameter estimates are not sensitive to informative priors.