The increasing adoption of machine learning tools has led to calls for accountability via model interpretability. But what does it mean for a machine learning model to be interpretable by humans, and how can this be assessed? We focus on two definitions of interpretability that have been introduced in the machine learning literature: simulatability (a user's ability to run a model on a given input) and "what if" local explainability (a user's ability to correctly determine a model's prediction under local changes to the input, given knowledge of the model's original prediction). Through a user study with 1,000 participants, we test whether humans perform well on tasks that mimic the definitions of simulatability and "what if" local explainability on models that are typically considered locally interpretable. To track the relative interpretability of models, we employ a simple metric, the runtime operation count on the simulatability task. We find evidence that as the number of operations increases, participant accuracy on the local interpretability tasks decreases. In addition, this evidence is consistent with the common intuition that decision trees and logistic regression models are interpretable and are more interpretable than neural networks.
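The operation-count metric lends itself to a concrete illustration. The sketch below is not the study's experimental code; the models, thresholds, and the convention of counting one operation per comparison, multiplication, or addition are illustrative assumptions. It tallies the work a participant would do to hand-simulate a small decision tree versus a logistic regression on one input.

```python
import numpy as np

# Toy illustration of the "runtime operation count" idea: each comparison,
# multiplication, and addition a user performs while hand-simulating a model
# counts as one operation. Models and inputs are hypothetical.

def tree_operations(x, node):
    """Simulate a decision tree; return (prediction, operations performed)."""
    ops = 0
    while isinstance(node, dict):          # internal node
        ops += 1                           # one threshold comparison
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node, ops                       # leaf holds the class label

def logreg_operations(x, weights, bias):
    """Simulate logistic regression; return (prediction, operations performed)."""
    ops = 2 * len(weights) + 1             # one multiply + one add per weight, plus bias
    score = float(np.dot(weights, x) + bias)
    ops += 1                               # final comparison against zero
    return int(score > 0), ops

tree = {"feature": 0, "threshold": 2.5,
        "left": 0,
        "right": {"feature": 1, "threshold": 1.0, "left": 0, "right": 1}}
x = np.array([3.0, 0.5])
print(tree_operations(x, tree))            # deeper paths -> more operations
print(logreg_operations(x, np.array([0.4, -1.2]), 0.1))
```

A tree's count grows with the depth of the path the input takes, while the logistic regression's count is fixed by its number of weights, which gives a common footing for comparing the simulatability burden of different models.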
Uncertainty Estimation for Tumor Prediction with Unlabeled Data
Introducing interpretability and reasoning into Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the complexity of gigapixel slides. Traditionally, MIL interpretability is limited to identifying salient regions deemed pertinent for downstream tasks, offering little insight to the end user (pathologist) regarding the rationale behind these selections. To address this, we propose Self-Interpretable MIL (SI-MIL), a method intrinsically designed for interpretability from the very outset. SI-MIL employs a deep MIL framework to guide an interpretable branch grounded on handcrafted pathological features, facilitating linear predictions. Beyond identifying salient regions, SI-MIL uniquely provides feature-level interpretations rooted in pathological insights for WSIs. Notably, SI-MIL, with its linear prediction constraints, challenges the prevalent myth of an inevitable trade-off between model interpretability and performance, demonstrating competitive results compared to state-of-the-art methods on WSI-level prediction tasks across three cancer types. In addition, we thoroughly benchmark the local and global interpretability of SI-MIL in terms of statistical analysis, a domain-expert study, and desiderata of interpretability, namely user-friendliness and faithfulness.
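The flow described above can be summarized in a schematic sketch. This is not the authors' implementation: the patch encoder is mocked with random embeddings, the attention head and handcrafted features (and the dimensions N, D, K) are placeholders, and only the structure follows the abstract, namely a deep branch that scores patch saliency and a linear model over saliency-pooled handcrafted features that makes the prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical WSI bag: N patches, each with a deep embedding (for saliency)
# and K handcrafted pathological features (for the interpretable branch).
N, D, K = 500, 128, 12
deep_feats = rng.normal(size=(N, D))        # patch embeddings from a (mock) frozen encoder
path_feats = rng.normal(size=(N, K))        # e.g., nuclear area, stain density, ...

# Deep MIL branch: a (mock) attention head scores patch saliency.
attn_w = rng.normal(size=D) / np.sqrt(D)
scores = deep_feats @ attn_w
saliency = np.exp(scores - scores.max())
saliency /= saliency.sum()                  # soft patch-selection weights

# Interpretable branch: saliency-weighted pooling of handcrafted features,
# followed by a *linear* prediction, so each feature's contribution is legible.
bag_repr = saliency @ path_feats            # (K,) slide-level handcrafted profile
linear_w = rng.normal(size=K)               # learned in SI-MIL; random here
logit = bag_repr @ linear_w

# Feature-level interpretation: per-feature contribution to the slide logit.
contributions = bag_repr * linear_w
top = np.argsort(-np.abs(contributions))[:3]
print("prediction:", 1 / (1 + np.exp(-logit)))
print("most influential handcrafted features:", top, contributions[top])
```

Because the final layer is linear, each handcrafted feature's contribution to the slide-level logit is a single product, which is what makes the feature-level interpretation legible.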
- Award ID(s): 2144901
- PAR ID: 10537188
- Publisher / Repository: Conference on Computer Vision and Pattern Recognition (CVPR)
- Sponsoring Org: National Science Foundation
More Like this
- We propose and test a novel graph learning-based explainable artificial intelligence (XAI) approach to address the challenge of developing explainable predictions of patient length of stay (LoS) in intensive care units (ICUs). Specifically, we address a notable gap in the literature on XAI methods that identify interactions between model input features to predict patient health outcomes. Our model intrinsically constructs a patient-level graph, which identifies the importance of feature interactions for prediction of health outcomes (a minimal sketch of this interaction-scoring idea follows this list). It demonstrates state-of-the-art explanation capabilities based on identification of salient feature interactions compared with traditional XAI methods for prediction of LoS. We supplement our XAI approach with a small-scale user study, which demonstrates that our model can lead to greater user acceptance of artificial intelligence (AI) model-based decisions by contributing to greater interpretability of model predictions. Our model lays the foundation for developing interpretable, predictive tools that healthcare professionals can use to improve ICU resource allocation decisions and enhance the clinical relevance of AI systems in providing effective patient care. Although our primary research setting is the ICU, our graph learning model can be generalized to other healthcare contexts to accurately identify key feature interactions for prediction of other health outcomes, such as mortality, readmission risk, and hospitalizations.
- Transparency and accountability have become major concerns for black-box machine learning (ML) models. Proper explanations of model behavior increase model transparency and help researchers develop more accountable models. Graph neural networks (GNNs) have recently shown superior performance on many graph ML problems compared with traditional methods, and explaining them has attracted increased interest. However, GNN explanation for link prediction (LP) is lacking in the literature. LP is an essential GNN task that underlies web applications such as recommendation and sponsored search. Given that existing GNN explanation methods only address node- and graph-level tasks, we propose Path-based GNN Explanation for heterogeneous Link prediction (PaGE-Link), which generates explanations with connection interpretability, enjoys model scalability, and handles graph heterogeneity. Qualitatively, PaGE-Link can generate explanations as paths connecting a node pair, which naturally capture connections between the two nodes and transfer easily to human-interpretable explanations (a generic path-ranking sketch follows this list). Quantitatively, explanations generated by PaGE-Link improve AUC for recommendation on citation and user-item graphs by 9-35% and are chosen as better by 78.79% of responses in human evaluation.
- Recommending products to users with intuitive explanations helps improve system transparency, persuasiveness, and satisfaction. Existing interpretation techniques include post-hoc methods and interpretable modeling. The former category can quantitatively analyze input contributions to model predictions but has limited interpretation faithfulness, while the latter can explain model internal mechanisms but may not directly attribute model predictions to input features. In this study, we propose a novel Dual Interpretable Recommendation model called DIRECT, which integrates ideas from the two interpretation categories to inherit their advantages and avoid their limitations. Specifically, DIRECT makes use of item descriptions as explainable evidence for recommendation. First, similar to post-hoc interpretation, DIRECT can attribute the prediction of a user preference score to textual words of the item description. The attribution of each word is related to its sentiment polarity and word importance, where a word is important if it corresponds to an item aspect that the user is interested in (a toy attribution example follows this list). Second, to improve the interpretability of the embedding space, we propose to extract high-level concepts from embeddings, where each concept corresponds to an item aspect. To learn discriminative concepts, we employ a concept-bottleneck layer and maximize the coding rate reduction on word-aspect embeddings by leveraging a word-word affinity graph extracted from a pre-trained language model. In this way, DIRECT simultaneously achieves faithful attribution and a usable interpretation of the embedding space. We also show that DIRECT achieves linear inference time complexity with respect to the length of item reviews. We conduct experiments, including ablation studies, on five real-world datasets. Quantitative analysis, visualizations, and case studies verify the interpretability of DIRECT. Our code is available at: https://github.com/JacksonWuxs/DIRECT.
- Learning interpretable representations in an unsupervised setting is an important yet challenging task. Existing unsupervised interpretable methods focus on extracting independent salient features from data; however, they overlook the fact that the entanglement of salient features may also be informative. Acknowledging these entanglements can improve interpretability, enabling the extraction of a higher-quality and wider variety of salient features. In this paper, we propose a new method that enables Generative Adversarial Networks (GANs) to discover salient features that may be entangled in an informative manner, instead of extracting only disentangled features. Specifically, we propose a regularizer that penalizes the disagreement between the extracted feature interactions and a given dependency structure during training. We model these interactions using a Bayesian network, estimate the maximum-likelihood parameters, and compute a negative likelihood score to measure the disagreement (a sketch of such a penalty follows this list). Upon qualitatively and quantitatively evaluating the proposed method on both synthetic and real-world datasets, we show that our proposed regularizer guides GANs to learn representations with disentanglement scores competitive with the state of the art, while extracting a wider variety of salient features.
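The ICU length-of-stay entry above hinges on scoring feature interactions on a patient-level graph. The abstract does not spell out the mechanism, so the sketch below is only one plausible reading under assumed shapes: clinical features become graph nodes, and a bilinear form with a hypothetical learned matrix W ranks pairwise interactions by their weight in the prediction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical patient: F clinical features (vitals, labs, ...), embedded in d dims.
F, d = 6, 16
feature_names = ["heart_rate", "lactate", "creatinine", "age", "map", "gcs"]
node_emb = rng.normal(size=(F, d))          # per-feature node embeddings

# Bilinear interaction scoring: W would be learned in a real model; random here.
W = rng.normal(size=(d, d)) / d
interaction = node_emb @ W @ node_emb.T           # (F, F) pairwise interaction scores
interaction = (interaction + interaction.T) / 2   # symmetrize

# Rank feature pairs by |score| to surface the interactions driving the prediction.
iu = np.triu_indices(F, k=1)
order = np.argsort(-np.abs(interaction[iu]))
for idx in order[:3]:
    i, j = iu[0][idx], iu[1][idx]
    print(f"{feature_names[i]} x {feature_names[j]}: {interaction[i, j]:+.3f}")
```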
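For PaGE-Link, the abstract's central move (explain a predicted link as paths connecting the node pair) can be sketched generically. The graph, the edge-importance mask, and the product-of-importances path score below are all illustrative assumptions, not the paper's learned mask or scoring rule.

```python
# Toy path-based link-prediction explanation in the spirit of PaGE-Link.
# Graph, edge-importance mask, and node pair are all made up.

graph = {                                   # adjacency list of a small user-item graph
    "u1": ["p1", "p2"],
    "p1": ["u1", "u2"],
    "p2": ["u1", "u2", "u3"],
    "u2": ["p1", "p2", "p3"],
    "u3": ["p2"],
    "p3": ["u2"],
}
# Hypothetical learned mask: importance of each (undirected) edge in [0, 1].
mask = {("u1", "p1"): 0.9, ("p1", "u2"): 0.8,
        ("u1", "p2"): 0.4, ("p2", "u2"): 0.3,
        ("p2", "u3"): 0.2, ("u2", "p3"): 0.7}

def edge_score(a, b):
    return mask.get((a, b)) or mask.get((b, a), 0.0)

def paths(src, dst, max_len=4):
    """Enumerate simple paths of up to max_len edges via iterative DFS."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            yield path
            continue
        if len(path) > max_len:
            continue
        for nxt in graph[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))

def path_importance(path):
    score = 1.0
    for a, b in zip(path, path[1:]):
        score *= edge_score(a, b)           # product of edge importances along the path
    return score

# Explain the predicted link (u1, u2) with its highest-importance paths.
best = sorted(paths("u1", "u2"), key=path_importance, reverse=True)
for p in best[:2]:
    print(" -> ".join(p), f"(importance {path_importance(p):.2f})")
```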
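DIRECT's word-level attribution, where each description word contributes its importance times its sentiment polarity to the preference score, reduces to a small worked example. The words and numbers below are invented for illustration, not learned values.

```python
import numpy as np

# Toy DIRECT-style attribution: the preference score decomposes over the
# words of an item description. Values are illustrative, not learned.
words      = ["battery", "lasts", "long", "screen", "dim"]
polarity   = np.array([ 0.0,  0.3,  0.8,  0.0, -0.9])   # sentiment of each word
importance = np.array([ 0.9,  0.5,  0.7,  0.8,  0.6])   # aspect relevance to this user

attribution = importance * polarity          # per-word contribution
score = attribution.sum()                    # user-item preference score

for w, a in sorted(zip(words, attribution), key=lambda t: -abs(t[1])):
    print(f"{w:>8}: {a:+.2f}")
print("preference score:", round(float(score), 2))
```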
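Finally, the GAN entry's regularizer (a negative likelihood score measuring disagreement between extracted features and a given Bayesian-network dependency structure) can be sketched for a linear-Gaussian case. The two-parent structure z1 -> z2 <- z3, the least-squares parameter fit, and scoring only the conditional term are simplifying assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Batch of salient features extracted by the encoder/generator (mocked here).
z = rng.normal(size=(256, 3))               # columns: z1, z2, z3

# Given dependency structure z1 -> z2 <- z3 as a linear-Gaussian Bayesian network.
# Maximum-likelihood edge weights and noise variance via least squares (z2 ~ z1, z3).
X = z[:, [0, 2]]                             # parents of z2
y = z[:, 1]
w, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ w
sigma2 = resid.var() + 1e-8

# Negative log-likelihood of z2 given its parents: the disagreement penalty.
nll = 0.5 * (np.log(2 * np.pi * sigma2) + resid**2 / sigma2)
regularizer = nll.mean()
print("dependency regularizer:", float(regularizer))
# During GAN training this term would be added to the loss:
# total_loss = gan_loss + lambda_reg * regularizer
```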