NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Causal Inference Despite Limited Global Confounding via Mixture Models

Gordon, S.; Mazaheri, B.; Rabani, Y.; Schulman, L. (April 2023, Proceedings of Machine Learning Research)
van der Schaar, M.; Zhang, C.; Janzing, D. (Ed.)
A Bayesian Network is a directed acyclic graph (DAG) on a set of n random variables (the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the random variables that is Markovian on the graph. A finite k-mixture of such models is graphically represented by a larger graph which has an additional “hidden” (or “latent”) random variable U, ranging in {1,...,k}, and a directed edge from U to every other vertex. Models of this type are fundamental to causal inference, where U models an unobserved confounding effect of multiple populations, obscuring the causal relationships in the observable DAG. By solving the mixture problem and recovering the joint probability distribution with U, traditionally unidentifiable causal relationships become identifiable. Using a reduction to the more well-studied “product” case on empty graphs, we give the first algorithm to learn mixtures of non-empty DAGs.
more » « less
Full Text Available
A refined approximation for Euclidean k-means

https://doi.org/10.1016/j.ipl.2022.106251

Grandoni, Fabrizio; Ostrovsky, Rafail; Rabani, Yuval; Schulman, Leonard J.; Venkat, Rakesh (June 2022, Information Processing Letters)

In the Euclidean k-Means problem we are given a collection of n points D in an Euclidean space and a positive integer k. Our goal is to identify a collection of k points in the same space (centers) so as to minimize the sum of the squared Euclidean distances between each point in D and the closest center. This problem is known to be APX-hard and the current best approximation ratio is a primal-dual 6.357 approximation based on a standard LP for the problem [Ahmadian et al. FOCS'17, SICOMP'20]. In this note we show how a minor modification of Ahmadian et al.'s analysis leads to a slightly improved 6.12903 approximation. As a related result, we also show that the mentioned LP has integrality gap at least (16+Sqrt(5))/15 > 1.2157. .
more » « less
Full Text Available
Hadamard Extensions and the Identification of Mixtures of Product Distributions

https://doi.org/10.1109/TIT.2022.3146630

Gordon, Spencer L.; Schulman, Leonard J. (June 2022, IEEE Transactions on Information Theory)

Full Text Available
The invisible hand of Laplace: The role of market structure in price convergence and oscillation

https://doi.org/10.1016/j.jmateco.2021.102475

Rabani, Yuval; Schulman, Leonard J. (August 2021, Journal of Mathematical Economics)
null (Ed.)
Full Text Available
Source Identification for Mixtures of Product Distributions

Gordon, S; Mazaheri, B H; Rabani, Y; Schulman, L J (August 2021, Proceedings of Thirty Fourth Conference on Learning Theory)
Belkin, M; Kpotufe, S (Ed.)
We give an algorithm for source identification of a mixture of k product distributions on n bits. This is a fundamental problem in machine learning with many applications. Our algorithm identifies the source parameters of an identifiable mixture, given, as input, approximate values of multilinear moments (derived, for instance, from a sufficiently large sample), using 2^O(k^2) n^O(k) arithmetic operations. Our result is the first explicit bound on the computational complexity of source identification of such mixtures. The running time improves previous results by Feldman, O’Donnell, and Servedio (FOCS 2005) and Chen and Moitra (STOC 2019) that guaranteed only learning the mixture (without parametric identification of the source). Our analysis gives a quantitative version of a qualitative characterization of identifiable sources that is due to Tahmasebi, Motahari, and Maddah-Ali (ISIT 2018).
more » « less
Full Text Available
Condition number bounds for causal inference

Gordon, S.L.; Kumar, V. M.; Schulman, L. J.; Srivastava, P. (January 2021, Proceedings of Machine Learning Research)
de Campos, C.; Maathuis, M. H. (Ed.)
An important achievement in the field of causal inference was a complete characterization of when a causal effect, in a system modeled by a causal graph, can be determined uniquely from purely observational data. The identification algorithms resulting from this work produce exact symbolic expressions for causal effects, in terms of the observational probabilities. More recent work has looked at the numerical properties of these expressions, in particular using the classical notion of the condition number. In its classical interpretation, the condition number quantifies the sensitivity of the output values of the expressions to small numerical perturbations in the input observational probabilities. In the context of causal identification, the condition number has also been shown to be related to the effect of certain kinds of uncertainties in the structure of the causal graphical model. In this paper, we first give an upper bound on the condition number for the interesting case of causal graphical models with small “confounded components”. We then develop a tight characterization of the condition number of any given causal identification problem. Finally, we use our tight characterization to give a specific example where the condition number can be much lower than that obtained via generic bounds on the condition number, and to show that even “equivalent” expressions for causal identification can behave very differently with respect to their numerical stability properties.
more » « less
Full Text Available
Condition number bounds for causal inference

Gordon, S. L.; Kumar, V. M.; Schulman, L. J.; Srivastava, P. (January 2021, Proceedings of Machine Learning Research)

An important achievement in the field of causal inference was a complete characterization of when a causal effect, in a system modeled by a causal graph, can be determined uniquely from purely observational data. The identification algorithms resulting from this work produce exact symbolic expressions for causal effects, in terms of the observational probabilities. More recent work has looked at the numerical properties of these expressions, in particular using the classical notion of the condition number. In its classical interpretation, the condition number quantifies the sensitivity of the output values of the expressions to small numerical perturbations in the input observational probabilities. In the context of causal identification, the condition number has also been shown to be related to the effect of certain kinds of uncertainties in the structure of the causal graphical model. In this paper, we first give an upper bound on the condition number for the interesting case of causal graphical models with small “confounded components”. We then develop a tight characterization of the condition number of any given causal identification problem. Finally, we use our tight characterization to give a specific example where the condition number can be much lower than that obtained via generic bounds on the condition number, and to show that even “equivalent” expressions for causal identification can behave very differently with respect to their numerical stability properties.
more » « less
Full Text Available
Convergence of incentive-driven dynamics in Fisher markets

https://doi.org/10.1016/j.geb.2020.11.005

Dvijotham, Krishnamurthy; Rabani, Yuval; Schulman, Leonard J. (December 2020, Games and Economic Behavior)
null (Ed.)
Full Text Available
Edge Expansion and Spectral Gap of Nonnegative Matrices

https://doi.org/10.1137/1.9781611975994.73

Mehta, Jenish C.; Schulman, Leonard J. (January 2020, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms)

The classic graphical Cheeger inequalities state that if $$M$$ is an $$n\times n$$ \emph{symmetric} doubly stochastic matrix, then \[ \frac{1-\lambda_{2}(M)}{2}\leq\phi(M)\leq\sqrt{2\cdot(1-\lambda_{2}(M))} \] where $$\phi(M)=\min_{S\subseteq[n],|S|\leq n/2}\left(\frac{1}{|S|}\sum_{i\in S,j\not\in S}M_{i,j}\right)$$ is the edge expansion of $$M$$, and $$\lambda_{2}(M)$$ is the second largest eigenvalue of $$M$$. We study the relationship between $$\phi(A)$$ and the spectral gap $$1-\re\lambda_{2}(A)$$ for \emph{any} doubly stochastic matrix $$A$$ (not necessarily symmetric), where $$\lambda_{2}(A)$$ is a nontrivial eigenvalue of $$A$$ with maximum real part. Fiedler showed that the upper bound on $$\phi(A)$$ is unaffected, i.e., $$\phi(A)\leq\sqrt{2\cdot(1-\re\lambda_{2}(A))}$$. With regards to the lower bound on $$\phi(A)$$, there are known constructions with \[ \phi(A)\in\Theta\left(\frac{1-\re\lambda_{2}(A)}{\log n}\right), \] indicating that at least a mild dependence on $$n$$ is necessary to lower bound $$\phi(A)$$. In our first result, we provide an \emph{exponentially} better construction of $$n\times n$$ doubly stochastic matrices $$A_{n}$$, for which \[ \phi(A_{n})\leq\frac{1-\re\lambda_{2}(A_{n})}{\sqrt{n}}. \] In fact, \emph{all} nontrivial eigenvalues of our matrices are $$0$$, even though the matrices are highly \emph{nonexpanding}. We further show that this bound is in the correct range (up to the exponent of $$n$$), by showing that for any doubly stochastic matrix $$A$$, \[ \phi(A)\geq\frac{1-\re\lambda_{2}(A)}{35\cdot n}. \] As a consequence, unlike the symmetric case, there is a (necessary) loss of a factor of $$n^{\alpha}$ for $$\frac{1}{2}\leq\alpha\leq1$$ in lower bounding $$\phi$$ by the spectral gap in the nonsymmetric setting. Our second result extends these bounds to general matrices $$R$$ with nonnegative entries, to obtain a two-sided \emph{gapped} refinement of the Perron-Frobenius theorem. Recall from the Perron-Frobenius theorem that for such $$R$$, there is a nonnegative eigenvalue $$r$$ such that all eigenvalues of $$R$$ lie within the closed disk of radius $$r$$ about $$0$$. Further, if $$R$$ is irreducible, which means $$\phi(R)>0$$ (for suitably defined $$\phi$$), then $$r$$ is positive and all other eigenvalues lie within the \textit{open} disk, so (with eigenvalues sorted by real part), $$\re\lambda_{2}(R)
more » « less
Full Text Available

Search for: All records