 Award ID(s):
 1942239
 NSFPAR ID:
 10410531
 Editor(s):
 Francisco Ruiz, Jennifer Dy
 Date Published:
 Journal Name:
 Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
 Volume:
 206
 Page Range / eLocation ID:
 4181-4195
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

Probabilistic graphical models, such as Markov random fields (MRFs), exploit dependencies among random variables to model a rich family of joint probability distributions. Inference algorithms, such as belief propagation (BP), can effectively compute the marginal posteriors for decision making. Nonetheless, inferences involve sophisticated probability calculations and are difficult for humans to interpret. No existing explanation method for MRFs is designed for fair attribution of an inference outcome to the elements of the MRF on which the inference takes place. Shapley values provide rigorous attributions but so far have not been studied on MRFs. We thus define Shapley values for MRFs to capture both the probabilistic and topological contributions of the variables on MRFs. We theoretically characterize the new definition with respect to independence, equal contribution, additivity, and submodularity. As brute-force computation of the Shapley values is challenging, we propose GraphShapley, an approximation algorithm that exploits the decomposability of Shapley values, the structure of MRFs, and the iterative nature of BP inference to speed up the computation. In practice, we propose meta-explanations to explain the Shapley values and make them more accessible and trustworthy to human users. On four synthetic and nine real-world MRFs, we demonstrate that GraphShapley generates sensible and practical explanations.
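To make the Shapley formulation concrete (this is the brute-force computation the abstract calls challenging, not the GraphShapley algorithm), here is a minimal exact Shapley-value routine for an arbitrary set-valued payoff. On an MRF the payoff would be derived from a BP marginal; the function and variable names here are illustrative, not from the paper.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values for payoff function `value`.

    `value` maps a frozenset of players to a real payoff. The cost is
    O(2^n) evaluations of `value`, which is exactly why approximation
    schemes such as GraphShapley are needed on large models."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Weight of coalition s in the Shapley average
                w = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                # Marginal contribution of player i to coalition s
                total += w * (value(s | {i}) - value(s))
        phi[i] = total
    return phi
```

For an additive payoff the Shapley value of each player is its own weight, which is a quick sanity check of the implementation.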

Identification theory for causal effects in causal models associated with hidden variable directed acyclic graphs (DAGs) is well studied. However, the corresponding algorithms are underused due to the complexity of estimating the identifying functionals they output. In this work, we bridge the gap between identification and estimation of population-level causal effects involving a single treatment and a single outcome. We derive influence-function-based estimators that exhibit double robustness for the identified effects in a large class of hidden variable DAGs where the treatment satisfies a simple graphical criterion; this class includes models yielding the adjustment and front-door functionals as special cases. We also provide necessary and sufficient conditions under which the statistical model of a hidden variable DAG is nonparametrically saturated and implies no equality constraints on the observed data distribution. Further, we derive an important class of hidden variable DAGs that imply observed data distributions observationally equivalent (up to equality constraints) to fully observed DAGs. In these classes of DAGs, we derive estimators that achieve the semiparametric efficiency bounds for the target of interest where the treatment satisfies our graphical criterion. Finally, we provide a sound and complete identification algorithm that directly yields a weight-based estimation strategy for any identifiable effect in hidden variable causal models.
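The paper's influence-function estimators cover a large class of hidden variable DAGs; as a hedged illustration of the double-robustness idea in its simplest special case, the adjustment (back-door) functional, here is the standard augmented IPW estimator of an average treatment effect. The function name and arguments are illustrative, not from the paper; `pi_hat` is a fitted propensity score and `mu1_hat`/`mu0_hat` are fitted outcome regressions.

```python
import numpy as np

def aipw_ate(a, y, pi_hat, mu1_hat, mu0_hat):
    """Augmented IPW estimate of E[Y(1)] - E[Y(0)] under adjustment.

    a: binary treatment array; y: outcome array.
    Doubly robust: consistent if EITHER the propensity model pi_hat
    OR the outcome models (mu1_hat, mu0_hat) is correctly specified."""
    psi1 = mu1_hat + a * (y - mu1_hat) / pi_hat            # E[Y(1)] term
    psi0 = mu0_hat + (1 - a) * (y - mu0_hat) / (1 - pi_hat)  # E[Y(0)] term
    return float(np.mean(psi1 - psi0))
```

In practice the nuisance functions would be estimated by regression or machine learning; here they are passed in directly so the estimator itself stays visible.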

Nonlinear state-space models are ubiquitous in modeling real-world dynamical systems. Sequential Monte Carlo (SMC) techniques, also known as particle methods, are a well-known class of parameter estimation methods for this general class of state-space models. Existing SMC-based techniques rely on excessive sampling of the parameter space, which makes their computation intractable for large systems or tall data sets. Bayesian optimization techniques have been used for fast inference in state-space models with intractable likelihoods. These techniques aim to find the maximum of the likelihood function by sequential sampling of the parameter space through a single SMC approximator. Various SMC approximators with different fidelities and computational costs are often available for sample-based likelihood approximation. In this paper, we propose a multi-fidelity Bayesian optimization algorithm for the inference of general nonlinear state-space models (MFBO-SSM), which enables simultaneous sequential selection of parameters and approximators. The accuracy and speed of the algorithm are demonstrated by numerical experiments using synthetic gene expression data from a gene regulatory network model and real data from the VIX stock price index.
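The SMC approximators the abstract refers to estimate the likelihood by particle filtering, with the particle count controlling the fidelity/cost trade-off. A minimal bootstrap particle filter for a toy 1-D linear-Gaussian model (an illustrative stand-in for a general nonlinear model, not the MFBO-SSM algorithm; all names are assumptions) might look like:

```python
import numpy as np

def pf_loglik(y, theta, n_particles=200, rng=None):
    """Bootstrap particle filter log-likelihood estimate for the model
        x_t = theta * x_{t-1} + v_t,  v_t ~ N(0, 1)
        y_t = x_t + w_t,              w_t ~ N(0, 1)

    n_particles is the fidelity knob: more particles -> lower variance,
    higher cost. Multi-fidelity BO would choose both theta and this knob."""
    rng = np.random.default_rng(rng)
    x = rng.normal(0.0, 1.0, n_particles)              # initial particles
    ll = 0.0
    for yt in y:
        x = theta * x + rng.normal(0.0, 1.0, n_particles)      # propagate
        logw = -0.5 * (yt - x) ** 2 - 0.5 * np.log(2 * np.pi)  # N(yt; x, 1)
        m = logw.max()
        w = np.exp(logw - m)
        ll += m + np.log(w.mean())          # log of the average weight
        idx = rng.choice(n_particles, n_particles, p=w / w.sum())
        x = x[idx]                          # multinomial resampling
    return ll
```

The returned value is an unbiased estimate of the likelihood (not the log-likelihood) on the natural scale; its sampling noise is precisely what makes a single high-fidelity approximator expensive and motivates mixing fidelities.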

van der Schaar, M.; Zhang, C.; Janzing, D. (Eds.) A Bayesian Network is a directed acyclic graph (DAG) on a set of n random variables (the vertices); a Bayesian Network Distribution (BND) is a probability distribution on the random variables that is Markovian on the graph. A finite k-mixture of such models is graphically represented by a larger graph which has an additional "hidden" (or "latent") random variable U, ranging in {1,...,k}, and a directed edge from U to every other vertex. Models of this type are fundamental to causal inference, where U models an unobserved confounding effect of multiple populations, obscuring the causal relationships in the observable DAG. By solving the mixture problem and recovering the joint probability distribution with U, traditionally unidentifiable causal relationships become identifiable. Using a reduction to the more well-studied "product" case on empty graphs, we give the first algorithm to learn mixtures of non-empty DAGs.
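The "product" case on empty graphs that the reduction targets is a mixture of product distributions: given U, the n variables are mutually independent. For binary variables this is a k-mixture of product-Bernoulli models, and a standard EM sketch for it (illustrative only; the paper's algorithm is different and comes with identifiability guarantees that EM lacks) is:

```python
import numpy as np

def em_product_bernoulli(X, k, n_iter=100, rng=0):
    """EM for a k-mixture of product-Bernoulli distributions.

    The latent U in {0,...,k-1} picks a component; given U, the n binary
    variables are independent (the empty-graph case). X: (m, n) 0/1 array.
    Returns (mixing weights pi, component means p of shape (k, n))."""
    rng = np.random.default_rng(rng)
    m, n = X.shape
    pi = np.full(k, 1.0 / k)
    p = rng.uniform(0.25, 0.75, (k, n))          # random initial means
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] proportional to P(U=j, x_i)
        logr = np.log(pi) + X @ np.log(p).T + (1 - X) @ np.log(1 - p).T
        logr -= logr.max(axis=1, keepdims=True)  # stabilize before exp
        r = np.exp(logr)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: reweighted empirical frequencies
        nk = r.sum(axis=0)
        pi = np.clip(nk / m, 1e-12, None)
        pi /= pi.sum()
        p = np.clip((r.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return pi, p
```

EM can converge to local optima, which is one reason algorithms with provable recovery (as in the abstract) are of interest.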

We propose a general method for constructing confidence sets and hypothesis tests that have finite-sample guarantees without regularity conditions. We refer to such procedures as "universal." The method is very simple and is based on a modified version of the usual likelihood-ratio statistic that we call "the split likelihood-ratio test" (split LRT) statistic. The (limiting) null distribution of the classical likelihood-ratio statistic is often intractable when used to test composite null hypotheses in irregular statistical models. Our method is especially appealing for statistical inference in these complex setups. The method we suggest works for any parametric model and also for some nonparametric models, as long as computing a maximum-likelihood estimator (MLE) is feasible under the null. Canonical examples arise in mixture modeling and shape-constrained inference, for which constructing tests and confidence sets has been notoriously difficult. We also develop various extensions of our basic methods. We show that in settings when computing the MLE is hard, for the purpose of constructing valid tests and intervals, it is sufficient to upper bound the maximum likelihood. We investigate some conditions under which our methods yield valid inferences under model misspecification. Further, the split LRT can be used with profile likelihoods to deal with nuisance parameters, and it can also be run sequentially to yield anytime-valid P values and confidence sequences. Finally, when combined with the method of sieves, it can be used to perform model selection with nested model classes.
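A minimal sketch of the split LRT for the simplest possible setting, testing the mean of a unit-variance Gaussian (names and the data split are illustrative; the method applies far more generally). The data are split in two: one half fits an unrestricted estimator, the other half evaluates the likelihood ratio against the null, and validity follows from Markov's inequality in finite samples.

```python
import numpy as np

def split_lrt(x, mu0=0.0, rng=0):
    """Split likelihood-ratio test of H0: mean = mu0 in a N(mu, 1) model.

    D1 provides ANY estimator of mu (here its sample mean); D0 evaluates
    the ratio U = L_D0(mu_hat) / L_D0(mu0). Under H0, E[U] <= 1, so
    P(U >= 1/alpha) <= alpha in finite samples. Returns min(1, 1/U)."""
    rng = np.random.default_rng(rng)
    idx = rng.permutation(len(x))
    d1, d0 = x[idx[: len(x) // 2]], x[idx[len(x) // 2:]]
    mu_hat = d1.mean()                      # estimator built from D1 only
    # log U = sum over D0 of log N(d; mu_hat, 1) - log N(d; mu0, 1)
    log_u = np.sum(-0.5 * (d0 - mu_hat) ** 2 + 0.5 * (d0 - mu0) ** 2)
    return min(1.0, float(np.exp(-log_u)))  # p-value = min(1, 1/U)
```

Note that no limiting chi-squared distribution is invoked anywhere, which is the point: the guarantee holds for any sample size and any (possibly irregular) model, as long as the null likelihood can be evaluated.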