Title: Consistent Estimation of Identifiable Nonparametric Mixture Models from Grouped Observations
Recent research has established sufficient conditions for finite mixture models to be identifiable from grouped observations. These conditions allow the mixture components to be nonparametric and have substantial (or even total) overlap. This work proposes an algorithm that consistently estimates any identifiable mixture model from grouped observations. Our analysis leverages an oracle inequality for weighted kernel density estimators of the distribution on groups, together with a general result showing that consistent estimation of the distribution on groups implies consistent estimation of mixture components. A practical implementation is provided for paired observations, and the approach is shown to outperform existing methods, especially when mixture components overlap significantly.
Award ID(s):
2008074
NSF-PAR ID:
10281838
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
34th Conference on Neural Information Processing Systems (NeurIPS 2020)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent research has established sufficient conditions for finite mixture models to be identifiable from grouped observations. These conditions allow the mixture components to be nonparametric and have substantial (or even total) overlap. This work proposes an algorithm that consistently estimates any identifiable mixture model from grouped observations. Our analysis leverages an oracle inequality for weighted kernel density estimators of the distribution on groups, together with a general result showing that consistent estimation of the distribution on groups implies consistent estimation of mixture components. A practical implementation is provided for paired observations, and the approach is shown to outperform existing methods, especially when mixture components overlap significantly.
  2. Abstract

    Precipitation in the outer tropical Andes is highly seasonal, exhibits considerable interannual variability, and is vital for regulating freshwater availability, flooding, glacier mass balance, and droughts. The primary driver of interannual variability is the El Niño Southern Oscillation (ENSO), with most investigations reporting that El Niño (La Niña) results in negative (positive) precipitation anomalies across the region. Recent investigations, however, have identified substantial spatiotemporal differences in ENSO‐precipitation relationships. Motivated by the dissimilarity of these findings, this study examines a carefully selected data set (≥ 90% completeness) of ground‐based precipitation observations from 75 high‐elevation (≥ 2,500 m above sea level) meteorological stations in the tropical Andes of southern Peru and Bolivia for the period 1972–2016. Distinct groups of stations and associated variability in precipitation characteristics (e.g., total seasonal precipitation, wet season onset, and wet season length) are identified. Using no spatial constraints, the K‐Means algorithm optimally grouped stations into five easily identifiable groups. The groups farthest from the Amazon basin had significant negative (positive) precipitation anomalies (p < .05) during El Niño (La Niña), aligning with the traditional view of ENSO‐precipitation relationships, while groups closest to the Amazon had opposite relationships. Additionally, though studies have reported delays in the wet season, years characterized by El Niño had an earlier wet season onset in all five groups. These findings may aid in improving seasonal climate prediction and managing water resources, and could allow for improved interpretation of tropical Andean ice cores.

     
  3. Many daily tasks involve the collaboration of both hands. Humans dexterously adjust hand poses and modulate the forces exerted by fingers in response to task demands. Hand pose selection has been intensively studied in unimanual tasks, but little work has investigated bimanual tasks. This work examines hand pose selection in a bimanual high-precision-screwing task taken from watchmaking. Twenty right-handed subjects dismounted a screw on the watch face with a screwdriver in two conditions. Results showed that although subjects used similar hand poses across steps within the same experimental conditions, the hand poses differed significantly in the two conditions. In the free-base condition, subjects needed to stabilize the watch face on the table. The role distribution across hands was strongly influenced by hand dominance: the dominant hand manipulated the tool, whereas the nondominant hand controlled the additional degrees of freedom that might impair performance. In contrast, in the fixed-base condition, the watch face was stationary. Subjects used both hands even though a single hand would have been sufficient. Importantly, hand poses decoupled the control of task-demanded force and torque across hands through virtual fingers that grouped multiple fingers into functional units. This preference for a bimanual over a unimanual control strategy could be an effort to reduce variability caused by mechanical couplings and to alleviate intrinsic sensorimotor processing burdens. To afford analysis of this variety of observations, a novel graphical matrix-based representation of the distribution of hand pose combinations was developed. Atypical hand poses that are not documented in extant hand taxonomies are also included. NEW & NOTEWORTHY We study hand pose selection in bimanual fine motor skills. To understand how roles and control variables are distributed across the hands and fingers, we compared two conditions when unscrewing a screw from a watch face. When the watch face needed positioning, role distribution was strongly influenced by hand dominance; when the watch face was stationary, a variety of hand pose combinations emerged. Control of independent task demands is distributed either across hands or across distinct groups of fingers.
  4. Motivated by problems in data clustering, we establish general conditions under which families of nonparametric mixture models are identifiable, by introducing a novel framework involving clustering overfitted parametric (i.e. misspecified) mixture models. These identifiability conditions generalize existing conditions in the literature, and are flexible enough to include for example mixtures of Gaussian mixtures. In contrast to the recent literature on estimating nonparametric mixtures, we allow for general nonparametric mixture components, and instead impose regularity assumptions on the underlying mixing measure. As our primary application, we apply these results to partition-based clustering, generalizing the notion of a Bayes optimal partition from classical parametric model-based clustering to nonparametric settings. Furthermore, this framework is constructive so that it yields a practical algorithm for learning identified mixtures, which is illustrated through several examples on real data. The key conceptual device in the analysis is the convex, metric geometry of probability measures on metric spaces and its connection to the Wasserstein convergence of mixing measures. The result is a flexible framework for nonparametric clustering with formal consistency guarantees. 
  5. Abstract. This study presents a characterization of the hygroscopic growth behaviour and effects of different inorganic seed particles on the formation of secondary organic aerosols (SOAs) from the dark ozone-initiated oxidation of isoprene at low NOx conditions. We performed simulations of isoprene oxidation using a gas-phase chemical reaction mechanism based on the Master Chemical Mechanism (MCM) in combination with an equilibrium gas–particle partitioning model to predict the SOA concentration. The equilibrium model accounts for non-ideal mixing in liquid phases, including liquid–liquid phase separation (LLPS), and is based on the AIOMFAC (Aerosol Inorganic–Organic Mixtures Functional groups Activity Coefficients) model for mixture non-ideality and the EVAPORATION (Estimation of VApour Pressure of ORganics, Accounting for Temperature, Intramolecular, and Non-additivity effects) model for pure compound vapour pressures. Measurements from the Cosmics Leaving Outdoor Droplets (CLOUD) chamber experiments, conducted at the European Organization for Nuclear Research (CERN) for isoprene ozonolysis cases, were used to aid in parameterizing the SOA yields at different atmospherically relevant temperatures, relative humidity (RH), and reacted isoprene concentrations. To represent the isoprene-ozonolysis-derived SOA, a selection of organic surrogate species is introduced in the coupled modelling system. The model predicts a single, homogeneously mixed particle phase at all relative humidity levels for SOA formation in the absence of any inorganic seed particles. In the presence of aqueous sulfuric acid or ammonium bisulfate seed particles, the model predicts LLPS to occur below ∼ 80 % RH, where the particles consist of an inorganic-rich liquid phase and an organic-rich liquid phase; however, this includes significant amounts of bisulfate and water partitioned to the organic-rich phase.
The measurements show an enhancement in the SOA amounts at 85 % RH, compared to 35 % RH, for both the seed-free and seeded cases. The model predictions of RH-dependent SOA yield enhancements at 85 % RH vs. 35 % RH are 1.80 for a seed-free case, 1.52 for the case with ammonium bisulfate seed, and 1.06 for the case with sulfuric acid seed. Predicted SOA yields are enhanced in the presence of an aqueous inorganic seed, regardless of the seed type (ammonium sulfate, ammonium bisulfate, or sulfuric acid) in comparison with seed-free conditions at the same RH level. We discuss the comparison of model-predicted SOA yields with a selection of other laboratory studies on isoprene SOA formation conducted at different temperatures and for a variety of reacted isoprene concentrations. Those studies were conducted at RH levels at or below 40 % with reported SOA mass yields ranging from 0.3 % up to 9.0 %, indicating considerable variations. A robust feature of our associated gas–particle partitioning calculations covering the whole RH range is the predicted enhancement of SOA yield at high RH (> 80 %) compared to low RH (dry) conditions, which is explained by the effect of particle water uptake and its impact on the equilibrium partitioning of all components. 