-
Agrawal, Shipra; Roth, Aaron (Eds.)
We consider the problem of \emph{identifying}, from statistics, a distribution of discrete random variables $X_1,\ldots,X_n$ that is a mixture of $k$ product distributions. The best previous sample complexity for $n \in O(k)$ was $(1/\zeta)^{O(k^2 \log k)}$ (under a mild separation assumption parameterized by $\zeta$). The best known lower bound was $\exp(\Omega(k))$. It is known that $n \geq 2k-1$ is necessary and sufficient for identification. We show, for any $n \geq 2k-1$, how to achieve sample complexity and run-time complexity $(1/\zeta)^{O(k)}$. We also extend the known lower bound of $e^{\Omega(k)}$ to match our upper bound across a broad range of $\zeta$. Our results are obtained by combining (a) a classic method for robust tensor decomposition with (b) a novel way of bounding the condition number of key matrices, called Hadamard extensions, by studying their action only on flattened rank-1 tensors.
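To make the setting concrete, here is a minimal Python sketch of the generative model whose parameters are to be identified, assuming binary alphabets for simplicity; all names (`pi`, `P`, `sample`) are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

k = 3
n = 2 * k - 1  # n >= 2k - 1 is necessary and sufficient for identification

# Mixture of k product distributions over n binary variables:
# mixing weights pi (length k) and per-component marginals P (k x n),
# where P[j, i] = Pr[X_i = 1 | component j].
pi = rng.dirichlet(np.ones(k))
P = rng.uniform(0.1, 0.9, size=(k, n))

def sample(m):
    """Draw m i.i.d. samples of (X_1, ..., X_n) from the mixture."""
    comps = rng.choice(k, size=m, p=pi)                 # latent component per sample
    return (rng.uniform(size=(m, n)) < P[comps]).astype(int)

X = sample(100_000)
# Empirical statistics of X (moments such as E[X_1 X_2 ...]) are all the
# identification algorithm gets to see; the task is to recover (pi, P).
print(X.mean(axis=0))  # empirical marginals, mixing the k components
```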
-
Products of experts (PoE) are layered networks in which the value at each node is an AND (or product) of the values (possibly negated) at its inputs. They were introduced as a neural network architecture that can efficiently learn to generate high-dimensional data satisfying many low-dimensional constraints, thereby allowing each individual expert to perform a simple task. PoEs have found a variety of applications in learning. We study the problem of identifiability of a product-of-experts model having a layer of binary latent variables and a layer of binary observables that are i.i.d. conditional on the latents. The previous best upper bound on the number of observables needed to identify the model was exponential in the number of parameters. We show: (a) when the latents are uniformly distributed, the model is identifiable with a number of observables equal to the number of parameters (and hence best possible); (b) in the more general case of arbitrarily distributed latents, the model is identifiable with a number of observables that is still linear in the number of parameters (and within a factor of two of best possible). The proofs rely on root-interlacing phenomena for some special three-term recurrences.
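As a simplified illustration, here is a Python sketch of one way such a model can be set up, with uniform latents as in case (a); the specific conditional probability (the product of the active experts' parameters) is an assumption chosen for illustration, not necessarily the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(1)

m = 3                 # binary latent variables (one per expert)
n_obs = 2 * m         # observables, i.i.d. conditional on the latents
p = rng.uniform(0.3, 0.9, size=m)   # illustrative per-expert parameters

def sample(num):
    """Uniform binary latents, then conditionally i.i.d. binary observables.

    Conditional on latents h, each observable is Bernoulli with success
    probability prod_j p_j^{h_j}, i.e. the product of the active experts.
    (This conditional form is an illustrative assumption.)
    """
    H = rng.integers(0, 2, size=(num, m))             # uniform latents, case (a)
    succ = np.prod(np.where(H == 1, p, 1.0), axis=1)  # product of active experts
    X = rng.uniform(size=(num, n_obs)) < succ[:, None]
    return H, X.astype(int)

H, X = sample(50_000)
print(X.mean())  # matches E[product of active experts] up to sampling error
```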
-
In the Euclidean k-Means problem we are given a collection of n points D in a Euclidean space and a positive integer k. Our goal is to identify a collection of k points in the same space (centers) so as to minimize the sum of the squared Euclidean distances between each point in D and the closest center. This problem is known to be APX-hard, and the current best approximation ratio is a primal-dual 6.357-approximation based on a standard LP for the problem [Ahmadian et al. FOCS'17, SICOMP'20]. In this note we show how a minor modification of Ahmadian et al.'s analysis leads to a slightly improved 6.12903-approximation. As a related result, we also show that the mentioned LP has integrality gap at least $(16+\sqrt{5})/15 > 1.2157$.
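For reference, a minimal Python sketch of the objective being approximated (the function name is illustrative), together with a numeric check of the quoted integrality-gap constant:

```python
import numpy as np

def kmeans_cost(D, centers):
    """Sum over points in D of the squared Euclidean distance to the
    nearest center -- the objective the approximation ratios refer to."""
    # pairwise squared distances: |D| x |centers|
    d2 = ((D[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

rng = np.random.default_rng(2)
D = rng.normal(size=(200, 2))                       # n = 200 points in the plane
centers = D[rng.choice(len(D), 3, replace=False)]   # k = 3 arbitrary centers
print(kmeans_cost(D, centers))

print((16 + 5 ** 0.5) / 15)  # the integrality-gap bound: 1.2157...
```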
-
The classic graphical Cheeger inequalities state that if $M$ is an $n\times n$ \emph{symmetric} doubly stochastic matrix, then \[ \frac{1-\lambda_{2}(M)}{2}\leq\phi(M)\leq\sqrt{2\cdot(1-\lambda_{2}(M))} \] where $\phi(M)=\min_{S\subseteq[n],|S|\leq n/2}\left(\frac{1}{|S|}\sum_{i\in S,j\not\in S}M_{i,j}\right)$ is the edge expansion of $M$, and $\lambda_{2}(M)$ is the second largest eigenvalue of $M$. We study the relationship between $\phi(A)$ and the spectral gap $1-\mathrm{Re}\,\lambda_{2}(A)$ for \emph{any} doubly stochastic matrix $A$ (not necessarily symmetric), where $\lambda_{2}(A)$ is a nontrivial eigenvalue of $A$ with maximum real part. Fiedler showed that the upper bound on $\phi(A)$ is unaffected, i.e., $\phi(A)\leq\sqrt{2\cdot(1-\mathrm{Re}\,\lambda_{2}(A))}$. With regard to the lower bound on $\phi(A)$, there are known constructions with \[ \phi(A)\in\Theta\left(\frac{1-\mathrm{Re}\,\lambda_{2}(A)}{\log n}\right), \] indicating that at least a mild dependence on $n$ is necessary to lower bound $\phi(A)$. In our first result, we provide an \emph{exponentially} better construction of $n\times n$ doubly stochastic matrices $A_{n}$ for which \[ \phi(A_{n})\leq\frac{1-\mathrm{Re}\,\lambda_{2}(A_{n})}{\sqrt{n}}. \] In fact, \emph{all} nontrivial eigenvalues of our matrices are $0$, even though the matrices are highly \emph{nonexpanding}. We further show that this bound is in the correct range (up to the exponent of $n$) by showing that for any doubly stochastic matrix $A$, \[ \phi(A)\geq\frac{1-\mathrm{Re}\,\lambda_{2}(A)}{35\cdot n}. \] As a consequence, unlike the symmetric case, there is a (necessary) loss of a factor of $n^{\alpha}$ for $\frac{1}{2}\leq\alpha\leq1$ in lower bounding $\phi$ by the spectral gap in the nonsymmetric setting. Our second result extends these bounds to general matrices $R$ with nonnegative entries, yielding a two-sided \emph{gapped} refinement of the Perron-Frobenius theorem. Recall from the Perron-Frobenius theorem that for such $R$ there is a nonnegative eigenvalue $r$ such that all eigenvalues of $R$ lie within the closed disk of radius $r$ about $0$. Further, if $R$ is irreducible, which means $\phi(R)>0$ (for suitably defined $\phi$), then $r$ is positive and all other eigenvalues lie within the \textit{open} disk, so (with eigenvalues sorted by real part) $\mathrm{Re}\,\lambda_{2}(R)<r$; our bounds make this gap quantitative in terms of $\phi(R)$.
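The quantities above are simple to verify numerically on small instances. Here is a Python sketch, using a brute-force computation of $\phi(M)$ (exponential in $n$, so toy sizes only) to check the two-sided Cheeger inequality on a symmetric doubly stochastic example, a lazy random walk on a cycle:

```python
import itertools
import numpy as np

def edge_expansion(M):
    """phi(M): minimum over nonempty S with |S| <= n/2 of
    (1/|S|) * sum_{i in S, j not in S} M[i, j].  Brute force."""
    n = M.shape[0]
    best = np.inf
    for r in range(1, n // 2 + 1):
        for S in itertools.combinations(range(n), r):
            S = list(S)
            T = [j for j in range(n) if j not in S]
            best = min(best, M[np.ix_(S, T)].sum() / len(S))
    return best

# Lazy random walk on a 6-cycle: symmetric and doubly stochastic.
n = 6
M = np.zeros((n, n))
for i in range(n):
    M[i, i] = 0.5
    M[i, (i + 1) % n] += 0.25
    M[i, (i - 1) % n] += 0.25

lam2 = np.linalg.eigvalsh(M)[-2]   # second largest eigenvalue (ascending order)
phi = edge_expansion(M)
# Cheeger: (1 - lam2)/2 <= phi <= sqrt(2 * (1 - lam2))
print((1 - lam2) / 2, phi, np.sqrt(2 * (1 - lam2)))
```

For nonsymmetric doubly stochastic matrices one would instead compare `phi` against $1-\mathrm{Re}\,\lambda_{2}$ computed from `np.linalg.eigvals`, which is exactly the regime where the $n^{\alpha}$ loss described above appears.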