skip to main content

Title: Explaining a series of models by propagating Shapley values
Abstract

Local feature attribution methods are increasingly used to explain complex machine learning models. However, current methods are limited because they are extremely expensive to compute or are not capable of explaining a distributed series of models where each model is owned by a separate institution. The latter is particularly important because it often arises in finance where explanations are mandated. Here, we present Generalized DeepSHAP (G-DeepSHAP), a tractable method to propagate local feature attributions through complex series of models based on a connection to the Shapley value. We evaluate G-DeepSHAP across biological, health, and financial datasets to show that it provides equally salient explanations an order of magnitude faster than existing model-agnostic attribution techniques and demonstrate its use in an important distributed series of models setting.

Authors:
; ;
Publication Date:
NSF-PAR ID:
10381675
Journal Name:
Nature Communications
Volume:
13
Issue:
1
ISSN:
2041-1723
Publisher:
Nature Publishing Group
Sponsoring Org:
National Science Foundation
More Like this
  1. Chaudhuri, Kamalika ; Jegelka, Stefanie ; Song, Le ; Szepesvari, Csaba ; Niu, Gang ; Sabato, Sivan (Ed.)
    Recent work has found that adversarially-robust deep networks used for image classification are more interpretable: their feature attributions tend to be sharper, and are more concentrated on the objects associated with the image’s ground- truth class. We show that smooth decision boundaries play an important role in this enhanced interpretability, as the model’s input gradients around data points will more closely align with boundaries’ normal vectors when they are smooth. Thus, because robust models have smoother boundaries, the results of gradient- based attribution methods, like Integrated Gradients and DeepLift, will capture more accurate information about nearby decision boundaries. This understanding of robust interpretability leads to our second contribution: boundary attributions, which aggregate information about the normal vectors of local decision bound- aries to explain a classification outcome. We show that by leveraging the key fac- tors underpinning robust interpretability, boundary attributions produce sharper, more concentrated visual explanations{—}even on non-robust models.
  2. ABSTRACT

    In the interstellar medium (ISM), the formation of complex organic molecules (COMs) is largely facilitated by surface reactions. However, in cold dark clouds, thermal desorption of COMs is inefficient because of the lack of thermal energy to overcome binding energies to the grain surface. Non-thermal desorption methods are therefore important explanations for the gas-phase detection of many COMs that are primarily formed on grains. Here, we present a new non-thermal desorption process: cosmic ray sputtering of grain ice surfaces based on water, carbon dioxide, and a simple mixed ice. Our model applies estimated rates of sputtering to the three-phase rate equation model nautilus-1.1, where this inclusion results in enhanced gas-phase abundances for molecules produced by grain reactions such as methanol (CH3OH) and methyl formate (HCOOCH3). Notably, species with efficient gas-phase destruction pathways exhibit less of an increase in models with sputtering compared to other molecules. These model results suggest that sputtering is an efficient, non-specific method of non-thermal desorption that should be considered as an important factor in future chemical models.

  3. Feature attributions and counterfactual explanations are popular approaches to explain a ML model. The former assigns an importance score to each input feature, while the latter provides input examples with minimal changes to alter the model's predictions. To unify these approaches, we provide an interpretation based on the actual causality framework and present two key results in terms of their use. First, we present a method to generate feature attribution explanations from a set of counterfactual examples. These feature attributions convey how important a feature is to changing the classification outcome of a model, especially on whether a subset of features is necessary and/or sufficient for that change, which attribution-based methods are unable to provide. Second, we show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency. As a result, we highlight the complimentary of these two approaches. Our evaluation on three benchmark datasets --- Adult-Income, LendingClub, and German-Credit --- confirms the complimentary. Feature attribution methods like LIME and SHAP and counterfactual explanation methods like Wachter et al. and DiCE often do not agree on feature importance rankings. In addition, by restricting the features that can be modifiedmore »for generating counterfactual examples, we find that the top-k features from LIME or SHAP are often neither necessary nor sufficient explanations of a model's prediction. Finally, we present a case study of different explanation methods on a real-world hospital triage problem.« less
  4. SHAP explanations are a popular feature-attribution mechanism for explainable AI. They use game-theoretic notions to measure the influence of individual features on the prediction of a machine learning model. Despite a lot of recent interest from both academia and industry, it is not known whether SHAP explanations of common machine learning models can be computed efficiently. In this paper, we establish the complexity of computing the SHAP explanation in three important settings. First, we consider fully-factorized data distributions, and show that the complexity of computing the SHAP explanation is the same as the complexity of computing the expected value of the model. This fully-factorized setting is often used to simplify the SHAP computation, yet our results show that the computation can be intractable for commonly used models such as logistic regression. Going beyond fully-factorized distributions, we show that computing SHAP explanations is already intractable for a very simple setting: computing SHAP explanations of trivial classifiers over naive Bayes distributions. Finally, we show that even computing SHAP over the empirical distribution is #P-hard.
  5. SHAP explanations are a popular feature-attribution mechanism for explainable AI. They use game-theoretic notions to measure the influence of individual features on the prediction of a machine learning model. Despite a lot of recent interest from both academia and industry, it is not known whether SHAP explanations of common machine learning models can be computed efficiently. In this paper, we establish the complexity of computing the SHAP explanation in three important settings. First, we consider fully-factorized data distributions, and show that the complexity of computing the SHAP explanation is the same as the complexity of computing the expected value of the model. This fully-factorized setting is often used to simplify the SHAP computation, yet our results show that the computation can be intractable for commonly used models such as logistic regression. Going beyond fully-factorized distributions, we show that computing SHAP explanations is already intractable for a very simple setting: computing SHAP explanations of trivial classifiers over naive Bayes distributions. Finally, we show that even computing SHAP over the empirical distribution is #P-hard.