skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Explaining AI Decisions Using Efficient Methods for Learning Sparse Boolean Formulae
In this paper, we consider the problem of learning Boolean formulae from examples obtained by actively querying an oracle that can label these examples as either positive or negative. This problem has received attention in both machine learning as well as formal methods communities, and it has been shown to have exponential worst-case complexity in the general case as well as for many restrictions. In this paper, we focus on learning sparse Boolean formulae which depend on only a small (but unknown) subset of the overall vocabulary of atomic propositions. We propose two algorithms—first, based on binary search in the Hamming space, and the second, based on random walk on the Boolean hypercube, to learn these sparse Boolean formulae with a given confidence. This assumption of sparsity is motivated by the problem of mining explanations for decisions made by artificially intelligent (AI) algorithms, where the explanation of individual decisions may depend on a small but unknown subset of all the inputs to the algorithm. We demonstrate the use of these algorithms in automatically generating explanations of these decisions. These explanations will make intelligent systems more understandable and accountable to human users, facilitate easier audits and provide diagnostic information in the case of failure. The proposed approach treats the AI algorithm as a black-box oracle; hence, it is broadly applicable and agnostic to the specific AI algorithm. We show that the number of examples needed for both proposed algorithms only grows logarithmically with the size of the vocabulary of atomic propositions. We illustrate the practical effectiveness of our approach on a diverse set of case studies.  more » « less
Award ID(s):
1740079 1750009
PAR ID:
10119085
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Journal of Automated Reasoning
ISSN:
0168-7433
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We study the problem of reinforcement learning for a task encoded by a reward machine. The task is defined over a set of properties in the environment, called atomic propositions, and represented by Boolean variables. One unrealistic assumption commonly used in the literature is that the truth values of these propositions are accurately known. In real situations, however, these truth values are uncertain since they come from sensors that suffer from imperfections. At the same time, reward machines can be difficult to model explicitly, especially when they encode complicated tasks. We develop a reinforcement-learning algorithm that infers a reward machine that encodes the underlying task while learning how to execute it, despite the uncertainties of the propositions’ truth values. In order to address such uncertainties, the algorithm maintains a probabilistic estimate about the truth value of the atomic propositions; it updates this estimate according to new sensory measurements that arrive from exploration of the environment. Additionally, the algorithm maintains a hypothesis reward machine, which acts as an estimate of the reward machine that encodes the task to be learned. As the agent explores the environment, the algorithm updates the hypothesis reward machine according to the obtained rewards and the estimate of the atomic propositions’ truth value. Finally, the algorithm uses a Q-learning procedure for the states of the hypothesis reward machine to determine an optimal policy that accomplishes the task. We prove that the algorithm successfully infers the reward machine and asymptotically learns a policy that accomplishes the respective task. 
    more » « less
  2. We give the first agnostic, efficient, proper learning algorithm for monotone Boolean functions. Given 2O~(n√/ε) uniformly random examples of an unknown function f:{±1}n→{±1}, our algorithm outputs a hypothesis g:{±1}n→{±1} that is monotone and (opt +ε)-close to f, where opt is the distance from f to the closest monotone function. The running time of the algorithm (and consequently the size and evaluation time of the hypothesis) is also 2O~(n√/ε), nearly matching the lower bound of [13]. We also give an algorithm for estimating up to additive error ε the distance of an unknown function f to monotone using a run-time of 2O~(n√/ε). Previously, for both of these problems, sample-efficient algorithms were known, but these algorithms were not run-time efficient. Our work thus closes this gap in our knowledge between the run-time and sample complexity.This work builds upon the improper learning algorithm of [17] and the proper semiagnostic learning algorithm of [40], which obtains a non-monotone Boolean-valued hypothesis, then “corrects” it to monotone using query-efficient local computation algorithms on graphs. This black-box correction approach can achieve no error better than 2 opt +ε information-theoretically; we bypass this barrier bya)augmenting the improper learner with a convex optimization step, andb)learning and correcting a real-valued function before rounding its values to Boolean. Our real-valued correction algorithm solves the “poset sorting” problem of [40] for functions over general posets with non-Boolean labels. 
    more » « less
  3. We explore eXplainable AI (XAI) to enhance user experience and understand the value of explanations in AI-driven pedagogical decisions within an Intelligent Pedagogical Agent (IPA). Our real-time and personalized explanations cater to students' attitudes to promote learning. In our empirical study, we evaluate the effectiveness of personalized explanations by comparing three versions of the IPA: (1) personalized explanations and suggestions, (2) suggestions but no explanations, and (3) no suggestions. Our results show the IPA with personalized explanations significantly improves students' learning outcomes compared to the other versions. 
    more » « less
  4. Interpretations of logical formulas over semirings (other than the Boolean semiring) have applications in various areas of computer science including logic, AI, databases, and security. Such interpretations provide richer information beyond the truth or falsity of a statement. Examples of such semirings include Viterbi semiring, min-max or access control semiring, tropical semiring, and fuzzy semiring. The present work investigates the complexity of constraint optimization problems over semirings. The generic optimization problem we study is the following: Given a propositional formula phi over n variable and a semiring (K,+, . ,0,1), find the maximum value over all possible interpretations of phi over K. This can be seen as a generalization of the well-known satisfiability problem (a propositional formula is satisfiable if and only if the maximum value over all interpretations/assignments over the Boolean semiring is 1). A related problem is to find an interpretation that achieves the maximum value. In this work, we first focus on these optimization problems over the Viterbi semiring, which we call optConfVal and optConf. We first show that for general propositional formulas in negation normal form, optConfVal and optConf are in FP^NP. We then investigate optConf when the input formula phi is represented in the conjunctive normal form. For CNF formulae, we first derive an upper bound on the value of optConf as a function of the number of maximum satisfiable clauses. In particular, we show that if r is the maximum number of satisfiable clauses in a CNF formula with m clauses, then its optConf value is at most 1/4^(m-r). Building on this we establish that optConf for CNF formulae is hard for the complexity class FP^NP[log]. We also design polynomial-time approximation algorithms and establish an inapproximability for optConfVal. We establish similar complexity results for these optimization problems over other semirings including tropical, fuzzy, and access control semirings. 
    more » « less
  5. Williams Brian; Chen Yiling; Neville Jennifer (Ed.)
    Interpretations of logical formulas over semirings (other than the Boolean semiring) have applications in various areas of computer science including logic, AI, databases, and security. Such interpretations provide richer information beyond the truth or falsity of a statement. Examples of such semirings include Viterbi semiring, min-max or access control semiring, tropical semiring, and fuzzy semiring. The present work investigates the complexity of constraint optimization problems over semirings. The generic optimization problem we study is the following: Given a propositional formula $$\varphi$$ over $$n$$ variable and a semiring $$(K,+,\cdot,0,1)$$, find the maximum value over all possible interpretations of $$\varphi$$ over $$K$$. This can be seen as a generalization of the well-known satisfiability problem (a propositional formula is satisfiable if and only if the maximum value over all interpretations/assignments over the Boolean semiring is 1). A related problem is to find an interpretation that achieves the maximum value. In this work, we first focus on these optimization problems over the Viterbi semiring, which we call \optrustval\ and \optrust. We first show that for general propositional formulas in negation normal form, \optrustval\ and {\optrust} are in $${\mathrm{FP}}^{\mathrm{NP}}$$. We then investigate {\optrust} when the input formula $$\varphi$$ is represented in the conjunctive normal form. For CNF formulae, we first derive an upper bound on the value of {\optrust} as a function of the number of maximum satisfiable clauses. In particular, we show that if $$r$$ is the maximum number of satisfiable clauses in a CNF formula with $$m$$ clauses, then its $$\optrust$$ value is at most $$1/4^{m-r}$$. Building on this we establish that {\optrust} for CNF formulae is hard for the complexity class $${\mathrm{FP}}^{\mathrm{NP}[\log]}$$. We also design polynomial-time approximation algorithms and establish an inapproximability for {\optrustval}. We establish similar complexity results for these optimization problems over other semirings including tropical, fuzzy, and access control semirings. 
    more » « less