skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: White-box Induction From SVM Models: Explainable AI with Logic Programming
We focus on the problem of inducing logic programs that explain models learned by the support vector machine (SVM) algorithm. The top-down sequential covering inductive logic programming (ILP) algorithms (e.g., FOIL) apply hill-climbing search using heuristics from information theory. A major issue with this class of algorithms is getting stuck in local optima. In our new approach, however, the data-dependent hill-climbing search is replaced with a model-dependent search where a globally optimal SVM model is trained first, then the algorithm looks into support vectors as the most influential data points in the model, and induces a clause that would cover the support vector and points that are most similar to that support vector. Instead of defining a fixed hypothesis search space, our algorithm makes use of SHAP, an example-specific interpreter in explainable AI, to determine a relevant set of features. This approach yields an algorithm that captures the SVM model’s underlying logic and outperforms other ILP algorithms in terms of the number of induced clauses and classification evaluation metrics.  more » « less
Award ID(s):
1910131
PAR ID:
10283088
Author(s) / Creator(s):
;
Editor(s):
Ricca, Francesco; Russo, Alessandra
Date Published:
Journal Name:
Theory and practice of logic programming
Volume:
20
Issue:
5
ISSN:
1475-3081
Page Range / eLocation ID:
656 - 670
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    The main objective of an unmanned aerial vehicle (UAV) path planning is to generate a flight path that links a start point to an endpoint in an indoor space avoiding obstacles.  Path planning is essential for many real-life applications such as an autonomous car, surveillance mission, farming robots, unmanned aerial vehicles package delivery, space exploration, and many others. To create an optimal path, we need to adopt a specific criterion to minimize the distance the UAV must travel such as the Euclidean distance. In this paper, we provide our initial idea of creating an optimal path for indoor UAV using both A* and the Late Acceptance Hill Climbing (LAHC) algorithms. We are adopting an indoor search environment with various complexity and utilize the Probabilistic Roadmap algorithm (PRM) as a search space for both algorithms. The basic idea following PRM is to generate random sample points in the space and search these points for an optimal path. The developed results show that the LAHC algorithm outperforms the A* algorithm. 
    more » « less
  2. null (Ed.)
    In this paper, we consider the distributed version of Support Vector Machine (SVM) under the coordinator model, where all input data (i.e., points in [Formula: see text] space) of SVM are arbitrarily distributed among [Formula: see text] nodes in some network with a coordinator which can communicate with all nodes. We investigate two variants of this problem, with and without outliers. For distributed SVM without outliers, we prove a lower bound on the communication complexity and give a distributed [Formula: see text]-approximation algorithm to reach this lower bound, where [Formula: see text] is a user specified small constant. For distributed SVM with outliers, we present a [Formula: see text]-approximation algorithm to explicitly remove the influence of outliers. Our algorithm is based on a deterministic distributed top [Formula: see text] selection algorithm with communication complexity of [Formula: see text] in the coordinator model. 
    more » « less
  3. Quantum linear system algorithms (QLSAs) have the potential to speed up algorithms that rely on solving linear systems. Interior point methods (IPMs) yield a fundamental family of polynomial-time algorithms for solving optimization problems. IPMs solve a Newton linear system at each iteration to compute the search direction; thus, QLSAs can potentially speed up IPMs. Due to the noise in contemporary quantum computers, quantum-assisted IPMs (QIPMs) only admit an inexact solution to the Newton linear system. Typically, an inexact search direction leads to an infeasible solution, so, to overcome this, we propose an inexact-feasible QIPM (IF-QIPM) for solving linearly constrained quadratic optimization problems. We also apply the algorithm to ℓ1-norm soft margin support vector machine (SVM) problems, and demonstrate that our algorithm enjoys a speedup in the dimension over existing approaches. This complexity bound is better than any existing classical or quantum algorithm that produces a classical solution. 
    more » « less
  4. Abstract Gene trees can be different from the species tree due to biological processes and inference errors. One way to obtain a species tree is to find one that maximizes some measure of similarity to a set of gene trees. The number of shared quartets between a potential species tree and gene trees provides a statistically justifiable score; if maximized properly, it could result in a statistically consistent estimator of the species tree under several statistical models of discordance. However, finding the median quartet score tree, one that maximizes this score, is NP-Hard, motivating several existing heuristic algorithms. These heuristics do not follow the hill-climbing paradigm used extensively in phylogenetics. In this paper, we make theoretical contributions that enable an efficient hill-climbing approach. Specifically, we show that a subtree of sizemcan be placed optimally on a tree of sizenin quasi-linear time with respect tonand (almost) independently ofm. This result enables us to perform subtree prune and regraft (SPR) rearrangements as part of a hill-climbing search. We show that this approach can slightly improve upon the results of widely-used methods such as ASTRAL in terms of the optimization score but not necessarily accuracy. 
    more » « less
  5. Stochastic Gradient Descent (SGD) is a valuable algorithm for large-scale machine learning, but has proven difficult to parallelize on conventional architectures because of communication and memory access issues. The HogWild series of mixed logically distributed and physically multi-threaded algorithms overcomes these issues for problems with sparse characteristics by using multiple local model vectors with asynchronous atomic updates. While this approach has proven effective for several reported examples, there are others, especially very sparse cases, that do not scale as well. This paper discusses an SGD Support Vector Machine (SVM) on a cacheless migrating thread architecture using the Hogwild algorithms as a framework. Our implementations on this novel architecture achieved superior hardware efficiency and scalability over that of a conventional cluster using MPI. Furthermore these improvements were gained using naive data partitioning techniques and hardware with substantially less compute capability than that present in conventional systems. 
    more » « less