Counterfactual examples are one of the most commonlycited methods for explaining the predictions of machine learning models in key areas such as finance and medical diagnosis. Counterfactuals are often discussed under the assumption that the model on which they will be used is static, but in deployment models may be periodically retrained or finetuned. This paper studies the consistency of model prediction on counterfactual examples in deep networks under small changes to initial training conditions, such as weight initialization and leaveoneout variations in data, as often occurs during model deployment. We demonstrate experimentally that counterfactual examples for deep models are often inconsistent across such small changes, and that increasing the cost of the counterfactual, a stabilityenhancing mitigation suggested by prior work in the context of simpler models, is not a reliable heuristic in deep networks. Rather, our analysis shows that a model's local Lipschitz continuity around the counterfactual is key to its consistency across related models. To this end, we propose Stable Neighbor Search as a way to generate more consistent counterfactual explanations, and illustrate the effectiveness of this approach on several benchmark datasets.
This content will become publicly available on January 1, 2024
Counterfactuals for the Future
Counterfactuals are often described as 'retrospective,' focusing on hypothetical alternatives to a realized past. This description relates to an often implicit assumption about the structure and stability of exogenous variables in the system being modeled –– an assumption that is reasonable in many settings where counterfactuals are used. In this work, we consider cases where we might reasonably make a different assumption about exogenous variables, namely, that the exogenous noise terms of each unit do exhibit some unitspecific structure and/or stability. This leads us to a different use of counterfactuals — a 'forwardlooking' rather than 'retrospective' counterfactual. We introduce counterfactual treatment choice, a type of treatment choice problem that motivates using forwardlooking counterfactuals. We then explore how mismatches between interventional versus forwardlooking counterfactual approaches to treatment choice, consistent with different assumptions about exogenous noise, can lead to counterintuitive results.
 Publication Date:
 NSFPAR ID:
 10388972
 Journal Name:
 Proceedings of the AAAI Conference on Artificial Intelligence
 Sponsoring Org:
 National Science Foundation
More Like this


Counterfactual examples are one of the most commonlycited methods for explaining the predictions of machine learning models in key areas such as finance and medical diagnosis. Counterfactuals are often discussed under the assumption that the model on which they will be used is static, but in deployment models may be periodically retrained or finetuned. This paper studies the consistency of model prediction on counterfactual examples in deep networks under small changes to initial training conditions, such as weight initialization and leaveoneout variations in data, as often occurs during model deployment. We demonstrate experimentally that counterfactual examples for deep models are often inconsistent across such small changes, and that increasing the cost of the counterfactual, a stabilityenhancing mitigation suggested by prior work in the context of simpler models, is not a reliable heuristic in deep networks. Rather, our analysis shows that a model's local Lipschitz continuity around the counterfactual is key to its consistency across related models. To this end, we propose Stable Neighbor Search as a way to generate more consistent counterfactual explanations, and illustrate the effectiveness of this approach on several benchmark datasets.

Treeform sequential decision making (TFSDM) extends classical oneshot decision making by modeling treeform interactions between an agent and a potentially adversarial environment. It captures the online decisionmaking problems that each player faces in an extensiveform game, as well as Markov decision processes and partiallyobservable Markov decision processes where the agent conditions on observed history. Over the past decade, there has been considerable effort into designing online optimization methods for TFSDM. Virtually all of that work has been in the fullfeedback setting, where the agent has access to counterfactuals, that is, information on what would have happened had the agent chosen a different action at any decision node. Little is known about the bandit setting, where that assumption is reversed (no counterfactual information is available), despite this latter setting being well understood for almost 20 years in oneshot decision making. In this paper, we give the first algorithm for the bandit linear optimization problem for TFSDM that offers both (i) lineartime iterations (in the size of the decision tree) and (ii) O(T−−√) cumulative regret in expectation compared to any fixed strategy, at all times T. This is made possible by new results that we derive, which may have independent uses asmore »

Treeform sequential decision making (TFSDM) extends classical oneshot decision making by modeling treeform interactions between an agent and a potentially adversarial environment. It captures the online decisionmaking problems that each player faces in an extensiveform game, as well as Markov decision processes and partiallyobservable Markov decision processes where the agent conditions on observed history. Over the past decade, there has been considerable effort into designing online optimization methods for TFSDM. Virtually all of that work has been in the fullfeedback setting, where the agent has access to counterfactuals, that is, information on what would have happened had the agent chosen a different action at any decision node. Little is known about the bandit setting, where that assumption is reversed (no counterfactual information is available), despite this latter setting being well understood for almost 20 years in oneshot decision making. In this paper, we give the first algorithm for the bandit linear optimization problem for TFSDM that offers both (i) lineartime iterations (in the size of the decision tree) and (ii) O(T−−√) cumulative regret in expectation compared to any fixed strategy, at all times T. This is made possible by new results that we derive, which may have independent uses asmore »

S. Koyejo ; S. Mohamed ; A. Agarwal ; D. Belgrave ; K. Cho, A. Oh (Ed.)We focus on causal discovery in the presence of measurement error in linear systems where the mixing matrix, i.e., the matrix indicating the independent exogenous noise terms pertaining to the observed variables, is identified up to permutation and scaling of the columns. We demonstrate a somewhat surprising connection between this problem and causal discovery in the presence of unobserved parentless causes, in the sense that there is a mapping, given by the mixing matrix, between the underlying models to be inferred in these problems. Consequently, any identifiability result based on the mixing matrix for one model translates to an identifiability result for the other model. We characterize to what extent the causal models can be identified under a twopart faithfulness assumption. Under only the first part of the assumption (corresponding to the conventional definition of faithfulness), the structure can be learned up to the causal ordering among an ordered grouping of the variables but not all the edges across the groups can be identified. We further show that if both parts of the faithfulness assumption are imposed, the structure can be learned up to a more refined ordered grouping. As a result of this refinement, for the latent variable modelmore »