Title: Explaining Image Classifiers
We focus on explaining image classifiers, taking the work of Mothilal et al. 2021 (MMTS) as our point of departure. We observe that, although MMTS claim to be using the definition of explanation proposed by Halpern 2016, they do not quite do so. Roughly speaking, Halpern's definition has a necessity clause and a sufficiency clause. MMTS replace the necessity clause by a requirement that, as we show, implies it. Halpern's definition also allows agents to restrict the set of options considered. While these differences may seem minor, as we show, they can have a nontrivial impact on explanations. We also show that, essentially without change, Halpern's definition can handle two issues that have proved difficult for other approaches: explanations of absence (when, for example, an image classifier for tumors outputs "no tumor") and explanations of rare events (such as tumors).
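To make the two clauses concrete, here is a minimal, hypothetical Python sketch of how one might test a candidate pixel-set explanation against a classifier. The classify callable, the background-filling scheme, and the pixel-set representation are illustrative assumptions; this is not the construction used by MMTS, and it only crudely approximates Halpern's formal definition, which is stated in terms of causal models and the agent's epistemic state.

```python
import numpy as np

def is_sufficient(classify, image, pixel_set, target_label, backgrounds):
    """Sufficiency (rough reading): fixing the pixels in `pixel_set` to their
    values in `image` forces the classifier to output `target_label`, no matter
    how the remaining pixels are filled in (here: by a sample of candidate
    backgrounds standing in for the contexts the agent considers possible)."""
    for bg in backgrounds:
        candidate = bg.copy()
        for (r, c) in pixel_set:
            candidate[r, c] = image[r, c]
        if classify(candidate) != target_label:
            return False
    return True

def is_necessary(classify, image, pixel_set, target_label, backgrounds):
    """Necessity (rough reading): no pixel in the set is redundant --
    dropping any single pixel breaks sufficiency."""
    for p in pixel_set:
        reduced = [q for q in pixel_set if q != p]
        if is_sufficient(classify, image, reduced, target_label, backgrounds):
            return False
    return True
```

Under this toy reading, a set of pixels counts as an explanation when it passes both checks; restricting the list of backgrounds plays the role of restricting the set of options the agent considers.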
Award ID(s):
2319186
PAR ID:
10613492
Author(s) / Creator(s):
;
Publisher / Repository:
International Joint Conferences on Artificial Intelligence Organization
Date Published:
ISBN:
978-1-956792-05-8
Page Range / eLocation ID:
264 to 272
Format(s):
Medium: X
Location:
Hanoi, Vietnam
Sponsoring Org:
National Science Foundation
More Like this
1. Anomaly/Outlier detection (AD/OD) is often used in controversial applications to detect unusual behavior which is then further investigated or policed. This means an explanation of why something was predicted as an anomaly is desirable not only for individuals but also for the general population and policy-makers. However, existing explainable AI (XAI) methods are not well suited for explainable anomaly detection (XAD). In particular, most XAI methods provide instance-level explanations, whereas a model/global-level explanation is desirable for a complete understanding of the definition of normality or abnormality used by an AD algorithm. Further, existing XAI methods try to explain an algorithm's behavior by finding an explanation of why an instance belongs to a category. However, by definition, anomalies/outliers are chosen because they are different from the normal instances. We propose a new style of model-agnostic explanation, called contrastive explanation, that is designed specifically for AD algorithms and uses semantic tags to create explanations. It addresses the novel challenge of providing a model-agnostic, global-level explanation by finding contrasts between the outlier group of instances and the normal group. We propose three formulations: (i) Contrastive Explanation, (ii) Strongly Contrastive Explanation, and (iii) Multiple Strong Contrastive Explanations. The last formulation is specifically for the case where a given dataset is believed to have many types of anomalies. For the first two formulations, we show the underlying problem is in the computational class P by presenting linear- and polynomial-time exact algorithms. We show that the last formulation is computationally intractable, and we use an integer linear program for that version to generate experimental results. We demonstrate our work on several datasets, such as the CelebA image dataset, the HateXplain language dataset, and the COMPAS dataset on fairness. These datasets are chosen because their ground-truth explanations are clear or well-known.
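The contrastive idea can be illustrated with a toy sketch: rank semantic tags by how differently they occur in the group flagged as anomalous versus the normal group. The tag matrix, the scoring rule, and the greedy top-k selection below are assumptions for illustration; the paper's exact formulations, and the integer linear program for the multiple-explanations variant, are different.

```python
import numpy as np

def contrastive_tags(tag_matrix, is_outlier, k=5):
    """Toy contrastive explanation: rank semantic tags by the gap between
    their frequency in the outlier group and in the normal group.

    tag_matrix : (n_instances, n_tags) binary array of semantic tags
    is_outlier : (n_instances,) boolean array produced by the AD algorithm
    Returns the indices of the k most contrastive tags.
    """
    outlier_freq = tag_matrix[is_outlier].mean(axis=0)
    normal_freq = tag_matrix[~is_outlier].mean(axis=0)
    contrast = np.abs(outlier_freq - normal_freq)
    return np.argsort(-contrast)[:k]

# Example: 6 instances, 4 tags; the last two instances are flagged as outliers.
tags = np.array([[1, 0, 0, 1],
                 [1, 0, 0, 0],
                 [1, 0, 0, 1],
                 [1, 0, 0, 0],
                 [0, 1, 1, 0],
                 [0, 1, 1, 1]])
flags = np.array([False, False, False, False, True, True])
print(contrastive_tags(tags, flags, k=2))  # tags that best separate the two groups
```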
2. Feature attributions attempt to highlight which inputs drive predictive power. Good attributions or explanations are thus those that produce inputs that retain this predictive power; accordingly, evaluations of explanations score their quality of prediction. However, for a class of explanations called encoding explanations, evaluations produce scores better than what appears possible from the values in the explanation alone. Probing for encoding remains a challenge because there is no general characterization of what gives the extra predictive power. We develop a definition of encoding that identifies this extra predictive power via conditional dependence and show that the definition fits existing examples of encoding. This definition implies that, in contrast to encoding explanations, non-encoding explanations contain all the informative inputs used to produce the explanation, giving them a "what you see is what you get" property, which makes them transparent and simple to use. Next, we prove that existing scores (ROAR, FRESH, EVAL-X) do not rank non-encoding explanations above encoding ones, and we develop STRIPE-X, which ranks them correctly. After empirically demonstrating the theoretical insights, we use STRIPE-X to show that, despite being prompted to produce non-encoding explanations for a sentiment analysis task, an LLM generates explanations that encode.
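The conditional-dependence definition suggests a simple, purely illustrative diagnostic: check whether the pattern of which features an explanation reveals predicts the label beyond the revealed values themselves. The sketch below is an assumption-laden toy along those lines, using a hypothetical encoding_gap helper and off-the-shelf scikit-learn classifiers; it is not STRIPE-X and not the paper's formal test.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def encoding_gap(X, mask, y):
    """Toy diagnostic in the spirit of a conditional-dependence check:
    does the *pattern* of which features an explanation reveals predict
    the label beyond the revealed values themselves?

    X    : (n, d) feature matrix
    mask : (n, d) binary matrix, 1 where the explanation reveals a feature
    y    : (n,) labels
    Returns the gain in cross-validated accuracy from adding the mask pattern.
    """
    revealed = X * mask                      # revealed values, zeros elsewhere
    base = cross_val_score(LogisticRegression(max_iter=1000), revealed, y, cv=5).mean()
    with_mask = np.hstack([revealed, mask])  # values plus which-features-were-chosen
    full = cross_val_score(LogisticRegression(max_iter=1000), with_mask, y, cv=5).mean()
    return full - base                       # a large positive gap hints at encoding
```

A clearly positive gap is the kind of extra predictive power the encoding definition is meant to flag.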
3. This article reports on the existence of actual clause morphology and interpretation in selected Bantu languages. Essentially, we treat the actual clause as an embedded assertion whereby the utterer is committed not only to the truth of the proposition described by the actual clause but also to its realization: it must be the case that the event in the proposition cannot be unrealized (or describe a future state) at the time of the utterance. The Bantu languages in our sample mark the actual clause by a verbal prefix in a typical tense position on the lower verb. This prefix occurs as a single vowel or as a consonant/vowel combination. When the actual clause is a syntactic complement, it co-occurs with verbs that may be incompatible with indicative clauses. The clause is also semantically distinct from other clause types such as the infinitive and the subjunctive. Our analysis of actual clauses as assertions explains why they are not complements of factive verbs. We argue that the source of the speaker's commitment to truth arises in part from the way actual clauses are licensed by the clauses they are dependent on. That is, we propose that actual clauses are licensed by a "contingent antecedent clause" which is taken to be a precondition for the actual clause assertion. Our approach generalizes to explain other non-complement uses of actual/narrative clause types, typically described as "narrative" tense in Bantu, which bears the exact same morphology.
4. Abstract Brain tumor is a life-threatening disease that caused about 0.25 million deaths worldwide in 2020. Magnetic Resonance Imaging (MRI) is frequently used for diagnosing brain tumors. In medically underdeveloped regions, physicians who can accurately diagnose and assess the severity of brain tumors from MRI are in critically short supply. Deep learning methods have been developed to assist physicians in detecting brain tumors from MRI and determining their subtypes. In existing methods, neural architectures are manually designed by human experts, which is time-consuming and labor-intensive. To address this problem, we propose to automatically search for high-performance neural architectures for classifying brain tumors from MRIs, by leveraging a Learning-by-Self-Explanation (LeaSE) architecture search method. LeaSE consists of an explainer model and an audience model. The explainer aims to search for a highly performant architecture by encouraging the architecture to generate high-fidelity explanations of prediction outcomes, where the explanations' fidelity is evaluated by the audience model. LeaSE is formulated as a four-level optimization problem involving a sequence of four learning stages that are conducted end-to-end. We apply LeaSE to MRI-based brain tumor classification with four classes: glioma, meningioma, pituitary tumor, and healthy, on a dataset containing 3264 MRI images. Results show that our method can search for neural architectures that achieve better classification accuracy than manually designed deep neural networks while having fewer model parameters. For example, our method achieves a test accuracy of 90.6% and an AUC of 95.6% with 3.75M parameters, while the accuracy and AUC of a human-designed network, ResNet101, are 84.5% and 90.1%, respectively, with 42.56M parameters. In addition, our method outperforms state-of-the-art neural architecture search methods.
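As a rough illustration of fidelity-based evaluation, the hypothetical sketch below keeps only the pixels an explainer's saliency map marks as important and asks a separate audience model whether the original prediction survives. The function names, the top-k masking rule, and the zero-fill are assumptions for illustration; LeaSE's actual four-level optimization is not reproduced here.

```python
import numpy as np

def fidelity_score(explainer_predict, audience_predict, images, saliency_maps, top_k=0.1):
    """Toy fidelity check: keep only the top-k fraction of pixels that the
    explainer's saliency map marks as important, and measure how often a
    separate audience model reproduces the explainer's original prediction."""
    agreements = []
    for img, sal in zip(images, saliency_maps):
        original = explainer_predict(img)
        threshold = np.quantile(sal, 1.0 - top_k)
        masked = np.where(sal >= threshold, img, 0.0)  # zero out unimportant pixels
        agreements.append(audience_predict(masked) == original)
    return float(np.mean(agreements))
```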
  5. Abstract Although the existing causal inference literature focuses on the forward-looking perspective by estimating effects of causes, the backward-looking perspective can provide insights into causes of effects. In backward-looking causal inference, the probability of necessity measures the probability that a certain event is caused by the treatment given the observed treatment and outcome. Most existing results focus on binary outcomes. Motivated by applications with ordinal outcomes, we propose a general definition of the probability of necessity. However, identifying the probability of necessity is challenging because it involves the joint distribution of the potential outcomes. We propose a novel assumption of monotonic incremental treatment effect to identify the probability of necessity with ordinal outcomes. We also discuss the testable implications of this key identification assumption. When it fails, we derive explicit formulas of the sharp large-sample bounds on the probability of necessity. 
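For orientation only, the classical binary-outcome probability of necessity (in potential-outcome notation, with treatment indicator T and potential outcomes Y(1), Y(0)) can be written as below; the paper's general definition for ordinal outcomes, and the monotonic incremental treatment effect assumption used to identify it, are not reproduced here.

```latex
% Classical binary-outcome probability of necessity:
% among treated units that experienced the outcome, the chance that
% the outcome would not have occurred without treatment.
\[
  \mathrm{PN} \;=\; \Pr\bigl( Y(0) = 0 \,\big|\, T = 1,\; Y = 1 \bigr)
\]
```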