Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

van der Schaar, M. ; Zhang, C. ; Janzing, D. (Ed.)A Bayesian Network is a directed acyclic graph (DAG) on a set of n random variables (the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the random variables that is Markovian on the graph. A finite kmixture of such models is graphically represented by a larger graph which has an additional “hidden” (or “latent”) random variable U, ranging in {1,...,k}, and a directed edge from U to every other vertex. Models of this type are fundamental to causal inference, where U models an unobserved confounding effect of multiple populations, obscuring the causal relationships in the observable DAG. By solving the mixture problem and recovering the joint probability distribution with U, traditionally unidentifiable causal relationships become identifiable. Using a reduction to the more wellstudied “product” case on empty graphs, we give the first algorithm to learn mixtures of nonempty DAGs.more » « lessFree, publiclyaccessible full text available April 11, 2024

In the Euclidean kMeans problem we are given a collection of n points D in an Euclidean space and a positive integer k. Our goal is to identify a collection of k points in the same space (centers) so as to minimize the sum of the squared Euclidean distances between each point in D and the closest center. This problem is known to be APXhard and the current best approximation ratio is a primaldual 6.357 approximation based on a standard LP for the problem [Ahmadian et al. FOCS'17, SICOMP'20]. In this note we show how a minor modification of Ahmadian et al.'s analysis leads to a slightly improved 6.12903 approximation. As a related result, we also show that the mentioned LP has integrality gap at least (16+Sqrt(5))/15 > 1.2157. .more » « less

Belkin, M ; Kpotufe, S (Ed.)We give an algorithm for source identification of a mixture of k product distributions on n bits. This is a fundamental problem in machine learning with many applications. Our algorithm identifies the source parameters of an identifiable mixture, given, as input, approximate values of multilinear moments (derived, for instance, from a sufficiently large sample), using 2^O(k^2) n^O(k) arithmetic operations. Our result is the first explicit bound on the computational complexity of source identification of such mixtures. The running time improves previous results by Feldman, O’Donnell, and Servedio (FOCS 2005) and Chen and Moitra (STOC 2019) that guaranteed only learning the mixture (without parametric identification of the source). Our analysis gives a quantitative version of a qualitative characterization of identifiable sources that is due to Tahmasebi, Motahari, and MaddahAli (ISIT 2018).more » « less

An important achievement in the field of causal inference was a complete characterization of when a causal effect, in a system modeled by a causal graph, can be determined uniquely from purely observational data. The identification algorithms resulting from this work produce exact symbolic expressions for causal effects, in terms of the observational probabilities. More recent work has looked at the numerical properties of these expressions, in particular using the classical notion of the condition number. In its classical interpretation, the condition number quantifies the sensitivity of the output values of the expressions to small numerical perturbations in the input observational probabilities. In the context of causal identification, the condition number has also been shown to be related to the effect of certain kinds of uncertainties in the structure of the causal graphical model. In this paper, we first give an upper bound on the condition number for the interesting case of causal graphical models with small “confounded components”. We then develop a tight characterization of the condition number of any given causal identification problem. Finally, we use our tight characterization to give a specific example where the condition number can be much lower than that obtained via generic bounds on the condition number, and to show that even “equivalent” expressions for causal identification can behave very differently with respect to their numerical stability properties.more » « less

de Campos, C. ; Maathuis, M. H. (Ed.)An important achievement in the field of causal inference was a complete characterization of when a causal effect, in a system modeled by a causal graph, can be determined uniquely from purely observational data. The identification algorithms resulting from this work produce exact symbolic expressions for causal effects, in terms of the observational probabilities. More recent work has looked at the numerical properties of these expressions, in particular using the classical notion of the condition number. In its classical interpretation, the condition number quantifies the sensitivity of the output values of the expressions to small numerical perturbations in the input observational probabilities. In the context of causal identification, the condition number has also been shown to be related to the effect of certain kinds of uncertainties in the structure of the causal graphical model. In this paper, we first give an upper bound on the condition number for the interesting case of causal graphical models with small “confounded components”. We then develop a tight characterization of the condition number of any given causal identification problem. Finally, we use our tight characterization to give a specific example where the condition number can be much lower than that obtained via generic bounds on the condition number, and to show that even “equivalent” expressions for causal identification can behave very differently with respect to their numerical stability properties.more » « less

The classic graphical Cheeger inequalities state that if $M$ is an $n\times n$ \emph{symmetric} doubly stochastic matrix, then \[ \frac{1\lambda_{2}(M)}{2}\leq\phi(M)\leq\sqrt{2\cdot(1\lambda_{2}(M))} \] where $\phi(M)=\min_{S\subseteq[n],S\leq n/2}\left(\frac{1}{S}\sum_{i\in S,j\not\in S}M_{i,j}\right)$ is the edge expansion of $M$, and $\lambda_{2}(M)$ is the second largest eigenvalue of $M$. We study the relationship between $\phi(A)$ and the spectral gap $1\re\lambda_{2}(A)$ for \emph{any} doubly stochastic matrix $A$ (not necessarily symmetric), where $\lambda_{2}(A)$ is a nontrivial eigenvalue of $A$ with maximum real part. Fiedler showed that the upper bound on $\phi(A)$ is unaffected, i.e., $\phi(A)\leq\sqrt{2\cdot(1\re\lambda_{2}(A))}$. With regards to the lower bound on $\phi(A)$, there are known constructions with \[ \phi(A)\in\Theta\left(\frac{1\re\lambda_{2}(A)}{\log n}\right), \] indicating that at least a mild dependence on $n$ is necessary to lower bound $\phi(A)$. In our first result, we provide an \emph{exponentially} better construction of $n\times n$ doubly stochastic matrices $A_{n}$, for which \[ \phi(A_{n})\leq\frac{1\re\lambda_{2}(A_{n})}{\sqrt{n}}. \] In fact, \emph{all} nontrivial eigenvalues of our matrices are $0$, even though the matrices are highly \emph{nonexpanding}. We further show that this bound is in the correct range (up to the exponent of $n$), by showing that for any doubly stochastic matrix $A$, \[ \phi(A)\geq\frac{1\re\lambda_{2}(A)}{35\cdot n}. \] As a consequence, unlike the symmetric case, there is a (necessary) loss of a factor of $n^{\alpha}$ for $\frac{1}{2}\leq\alpha\leq1$ in lower bounding $\phi$ by the spectral gap in the nonsymmetric setting. Our second result extends these bounds to general matrices $R$ with nonnegative entries, to obtain a twosided \emph{gapped} refinement of the PerronFrobenius theorem. Recall from the PerronFrobenius theorem that for such $R$, there is a nonnegative eigenvalue $r$ such that all eigenvalues of $R$ lie within the closed disk of radius $r$ about $0$. Further, if $R$ is irreducible, which means $\phi(R)>0$ (for suitably defined $\phi$), then $r$ is positive and all other eigenvalues lie within the \textit{open} disk, so (with eigenvalues sorted by real part), $\re\lambda_{2}(R)
more » « less