skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: On Efficient and Scalable Computation of the Nonparametric Maximum Likelihood Estimator in Mixture Models
In this paper, we focus on the computation of the nonparametric maximum likelihood es- timator (NPMLE) in multivariate mixture models. Our approach discretizes this infinite dimensional convex optimization problem by setting fixed support points for the NPMLE and optimizing over the mixing proportions. We propose an efficient and scalable semis- mooth Newton based augmented Lagrangian method (ALM). Our algorithm outperforms the state-of-the-art methods (Kim et al., 2020; Koenker and Gu, 2017), capable of handling n ≈ 106 data points with m ≈ 104 support points. A key advantage of our approach is its strategic utilization of the solution’s sparsity, leading to structured sparsity in Hessian computations. As a result, our algorithm demonstrates better scaling in terms of m when compared to the mixsqp method (Kim et al., 2020). The computed NPMLE can be directly applied to denoising the observations in the framework of empirical Bayes. We propose new denoising estimands in this context along with their consistent estimates. Extensive nu- merical experiments are conducted to illustrate the efficiency of our ALM. In particular, we employ our method to analyze two astronomy data sets: (i) Gaia-TGAS Catalog (Anderson et al., 2018) containing approximately 1.4 × 106 data points in two dimensions, and (ii) a data set from the APOGEE survey (Majewski et al., 2017) with approximately 2.7 × 104 data points.  more » « less
Award ID(s):
2416172 2416250
PAR ID:
10496952
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Microtome
Date Published:
Journal Name:
Journal of machine learning research
ISSN:
1532-4435
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Online Matrix Factorization (OMF) is a fundamental tool for dictionary learning problems,giving an approximate representation of complex data sets in terms of a reduced number ofextracted features. Convergence guarantees for most of the OMF algorithms in the litera-ture assume independence between data matrices, and the case of dependent data streamsremains largely unexplored. In this paper, we show that a non-convex generalization ofthe well-known OMF algorithm for i.i.d. stream of data in (Mairal et al., 2010) convergesalmost surely to the set of critical points of the expected loss function, even when the datamatrices are functions of some underlying Markov chain satisfying a mild mixing condition.This allows one to extract features more efficiently from dependent data streams, as thereis no need to subsample the data sequence to approximately satisfy the independence as-sumption. As the main application, by combining online non-negative matrix factorizationand a recent MCMC algorithm for sampling motifs from networks, we propose a novelframework ofNetwork Dictionary Learning, which extracts “network dictionary patches”from a given network in an online manner that encodes main features of the network. Wedemonstrate this technique and its application to network denoising problems on real-worldnetwork data 
    more » « less
  2. We introduce a new distance-preserving compact representation of multi-dimensional point-sets. Given n points in a d-dimensional space where each coordinate is represented using B bits (i.e., dB bits per point), it produces a representation of size O( d log(d B/epsilon) +log n) bits per point from which one can approximate the distances up to a factor of 1 + epsilon. Our algorithm almost matches the recent bound of Indyk et al, 2017} while being much simpler. We compare our algorithm to Product Quantization (PQ) (Jegou et al, 2011) a state of the art heuristic metric compression method. We evaluate both algorithms on several data sets: SIFT, MNIST, New York City taxi time series and a synthetic one-dimensional data set embedded in a high-dimensional space. Our algorithm produces representations that are comparable to or better than those produced by PQ, while having provable guarantees on its performance. 
    more » « less
  3. We study the problem of fair allocation of M indivisible items among N agents using the popular notion of maximin share as our measure of fairness. The maximin share of an agent is the largest value she can guarantee herself if she is allowed to choose a partition of the items into N bundles (one for each agent), on the condition that she receives her least preferred bundle. A maximin share allocation provides each agent a bundle worth at least their maximin share. While it is known that such an allocation need not exist [Procaccia and Wang, 2014; Kurokawa et al., 2016], a series of work [Procaccia and Wang, 2014; David Kurokawa et al., 2018; Amanatidis et al., 2017; Barman and Krishna Murthy, 2017] provided 2/3 approximation algorithms in which each agent receives a bundle worth at least 2/3 times their maximin share. Recently, [Ghodsi et al., 2018] improved the approximation guarantee to 3/4. Prior works utilize intricate algorithms, with an exception of [Barman and Krishna Murthy, 2017] which is a simple greedy solution but relies on sophisticated analysis techniques. In this paper, we propose an alternative 2/3 maximin share approximation which offers both a simple algorithm and straightforward analysis. In contrast to other algorithms, our approach allows for a simple and intuitive understanding of why it works. 
    more » « less
  4. null (Ed.)
    We implement the formalization of natural logic-like monotonic inference using Unscoped Episodic Logical Forms (ULFs) by Kim et al. (2020). We demonstrate this system’s capacity to handle a variety of challenging semantic phenomena using the FraCaS dataset (Cooper et al., 1996).These results give empirical evidence for prior claims that ULF is an appropriate representation to mediate natural logic-like inferences. 
    more » « less
  5. We give two new quantum algorithms for solving semidefinite programs (SDPs) providing quantum speed-ups. We consider SDP instances with m constraint matrices, each of dimension n, rank at most r, and sparsity s. The first algorithm assumes an input model where one is given access to an oracle to the entries of the matrices at unit cost. We show that it has run time O~(s^2 (sqrt{m} epsilon^{-10} + sqrt{n} epsilon^{-12})), with epsilon the error of the solution. This gives an optimal dependence in terms of m, n and quadratic improvement over previous quantum algorithms (when m ~~ n). The second algorithm assumes a fully quantum input model in which the input matrices are given as quantum states. We show that its run time is O~(sqrt{m}+poly(r))*poly(log m,log n,B,epsilon^{-1}), with B an upper bound on the trace-norm of all input matrices. In particular the complexity depends only polylogarithmically in n and polynomially in r. We apply the second SDP solver to learn a good description of a quantum state with respect to a set of measurements: Given m measurements and a supply of copies of an unknown state rho with rank at most r, we show we can find in time sqrt{m}*poly(log m,log n,r,epsilon^{-1}) a description of the state as a quantum circuit preparing a density matrix which has the same expectation values as rho on the m measurements, up to error epsilon. The density matrix obtained is an approximation to the maximum entropy state consistent with the measurement data considered in Jaynes' principle from statistical mechanics. As in previous work, we obtain our algorithm by "quantizing" classical SDP solvers based on the matrix multiplicative weight update method. One of our main technical contributions is a quantum Gibbs state sampler for low-rank Hamiltonians, given quantum states encoding these Hamiltonians, with a poly-logarithmic dependence on its dimension, which is based on ideas developed in quantum principal component analysis. We also develop a "fast" quantum OR lemma with a quadratic improvement in gate complexity over the construction of Harrow et al. [Harrow et al., 2017]. We believe both techniques might be of independent interest. 
    more » « less