Title: Consistent Estimation of Identifiable Nonparametric Mixture Models from Grouped Observations
Recent research has established sufficient conditions for finite mixture models to be identifiable from grouped observations. These conditions allow the mixture components to be nonparametric and have substantial (or even total) overlap. This work proposes an algorithm that consistently estimates any identifiable mixture model from grouped observations. Our analysis leverages an oracle inequality for weighted kernel density estimators of the distribution on groups, together with a general result showing that consistent estimation of the distribution on groups implies consistent estimation of mixture components. A practical implementation is provided for paired observations, and the approach is shown to outperform existing methods, especially when mixture components overlap significantly.
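A core ingredient of the analysis above is a weighted kernel density estimator of the distribution on groups. The following is a minimal sketch for paired observations using a Gaussian product kernel; the function name `weighted_kde_pairs`, the uniform weighting default, and the fixed bandwidth are illustrative assumptions, not the paper's actual estimator or weighting scheme.

```python
import numpy as np

def weighted_kde_pairs(pairs, weights, grid, bandwidth=0.3):
    """Weighted Gaussian product-kernel density estimate for paired
    observations, evaluated on a grid x grid mesh. Hypothetical
    illustration: the paper's estimator and weights are more involved."""
    pairs = np.asarray(pairs, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalize the weights
    gx, gy = np.meshgrid(grid, grid, indexing="ij")
    dens = np.zeros_like(gx)
    for (x1, x2), wi in zip(pairs, w):
        kx = np.exp(-0.5 * ((gx - x1) / bandwidth) ** 2)
        ky = np.exp(-0.5 * ((gy - x2) / bandwidth) ** 2)
        dens += wi * kx * ky                 # product kernel on the pair
    return dens / (2.0 * np.pi * bandwidth ** 2)
```

With uniform weights this reduces to an ordinary bivariate KDE; nonuniform weights are where an oracle inequality of the kind described above would come into play.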
Award ID(s):
1838179
PAR ID:
10206637
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Advances in neural information processing systems
ISSN:
1049-5258
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  2. We consider nonparametric estimation of a mixed discrete‐continuous distribution under anisotropic smoothness conditions and a possibly increasing number of support points for the discrete part of the distribution. For these settings, we derive lower bounds on the estimation rates. Next, we consider a nonparametric mixture of normals model that uses continuous latent variables for the discrete part of the observations. We show that the posterior in this model contracts at rates that are equal to the derived lower bounds up to a log factor. Thus, Bayesian mixture of normals models can be used for (up to a log factor) optimal adaptive estimation of mixed discrete‐continuous distributions. The proposed model demonstrates excellent performance in simulations mimicking the first stage in the estimation of structural discrete choice models. 
  3. The task of mixture proportion estimation (MPE) is to estimate the weight of a component distribution in a mixture, given observations from both the component and mixture. Previous work on MPE adopts the irreducibility assumption, which ensures identifiablity of the mixture proportion. In this paper, we propose a more general sufficient condition that accommodates several settings of interest where irreducibility does not hold. We further present a resampling-based meta algorithm that takes any existing MPE algorithm designed to work under irreducibility and adapts it to work under our more general condition. Our approach empirically exhibits improved estimation performance relative to baseline methods and to a recently proposed regrouping-based algorithm. 
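For context, the classical irreducibility-based MPE target is the infimum density ratio kappa* = inf_x m(x)/f(x), where m is the mixture density and f the component density. The sketch below estimates it with a crude histogram ratio; the name `mpe_ratio_estimate`, the bin count, and the count threshold are all hypothetical choices, and practical MPE algorithms (including the ones discussed above) are considerably more careful.

```python
import numpy as np

def mpe_ratio_estimate(mixture_sample, component_sample, bins=20, min_count=5):
    """Naive histogram sketch of kappa* = inf_x m(x)/f(x) under
    irreducibility: take the minimum empirical-frequency ratio over
    bins where the component is well populated. Illustrative only."""
    lo = min(mixture_sample.min(), component_sample.min())
    hi = max(mixture_sample.max(), component_sample.max())
    edges = np.linspace(lo, hi, bins + 1)
    m_counts, _ = np.histogram(mixture_sample, edges)
    f_counts, _ = np.histogram(component_sample, edges)
    mask = f_counts >= min_count             # skip sparsely populated bins
    ratios = (m_counts[mask] / len(mixture_sample)) / (
        f_counts[mask] / len(component_sample))
    return float(np.clip(ratios.min(), 0.0, 1.0))
```

When the non-component part of the mixture has support disjoint from the component, the ratio is flat at the true proportion over the component's support, so even this crude estimate lands near the truth.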
  4. Development of a flexible Erlang mixture model for survival analysis is introduced. The model for the survival density is built from a structured mixture of Erlang densities, mixing on the integer shape parameter with a common scale parameter. The mixture weights are constructed through increments of a distribution function on the positive real line, which is assigned a Dirichlet process prior. The model has a relatively simple structure, balancing flexibility with efficient posterior computation. Moreover, it implies a mixture representation for the hazard function that involves time-dependent mixture weights, thus offering a general approach to hazard estimation. Extension of the model is made to accommodate survival responses corresponding to multiple experimental groups, using a dependent Dirichlet process prior for the group-specific distributions that define the mixture weights. Model properties, prior specification, and posterior simulation are discussed, and the methodology is illustrated with synthetic and real data examples. 
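The building block of the model above is the Erlang density, a gamma density with integer shape j and common scale theta: f_j(t) = t^(j-1) exp(-t/theta) / (theta^j (j-1)!). The sketch below evaluates a finite mixture of these densities with weights supplied directly; in the paper's model the weights arise from increments of a CDF with a Dirichlet process prior, which this sketch does not attempt.

```python
import numpy as np
from math import factorial

def erlang_mixture_density(t, weights, theta):
    """Evaluate f(t) = sum_j w_j * t^(j-1) exp(-t/theta)
    / (theta^j * (j-1)!) for integer shapes j = 1..len(weights)
    with a common scale theta. Weights are taken as given here,
    rather than induced by a Dirichlet process prior."""
    t = np.asarray(t, dtype=float)
    dens = np.zeros_like(t)
    for j, w in enumerate(weights, start=1):
        dens += (w * t ** (j - 1) * np.exp(-t / theta)
                 / (theta ** j * factorial(j - 1)))
    return dens
```

With a single weight on shape j = 1 this reduces to the exponential density 1/theta * exp(-t/theta), which is a quick sanity check on the formula.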
  5. This article develops a Markov chain Monte Carlo (MCMC) method for a class of models that encompasses finite and countable mixtures of densities and mixtures of experts with a variable number of mixture components. The method is shown to maximize the expected probability of acceptance for cross-dimensional moves and to minimize the asymptotic variance of sample average estimators under certain restrictions. The method can be represented as a retrospective sampling algorithm with an optimal choice of auxiliary priors and as a reversible jump algorithm with optimal proposal distributions. The method is primarily motivated by and applied to a Bayesian nonparametric model for conditional densities based on mixtures of a variable number of experts. The mixture of experts model outperforms standard parametric and nonparametric alternatives in out of sample performance comparisons in an application to Engel curve estimation. The proposed MCMC algorithm makes estimation of this model practical. 