Title: Learning Explainable Templated Graphical Models
Templated graphical models (TGMs) encode model structure using rules that capture recurring relationships between multiple random variables. While the rules in TGMs are interpretable, it is not clear how they can be used to generate explanations for the individual predictions of the model. Further, learning these rules from data comes with high computational costs: it typically requires an expensive combinatorial search over the space of rules and repeated optimization over rule weights. In this work, we propose a new structure learning algorithm, Explainable Structured Model Search (ESMS), that learns a templated graphical model and an explanation framework for its predictions. ESMS uses a novel search procedure to efficiently search the space of models and discover models that trade off predictive accuracy and explainability. We introduce the notion of relational stability and prove that our proposed explanation framework is stable. Further, our proposed piecewise pseudolikelihood (PPLL) objective does not require re-optimizing the rule weights across models during each iteration of the search. In our empirical evaluation on three real-world datasets, we show that our proposed approach not only discovers models that are explainable, but also significantly outperforms existing state-of-the-art structure learning approaches.
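The abstract does not state the PPLL objective explicitly. As a rough, non-authoritative sketch of how a piecewise pseudolikelihood decouples rule weights (following the standard pseudolikelihood construction; the notation w_r and \phi_r for rule r's weight and aggregate feature is assumed, not taken from the paper):

```latex
% Hedged sketch, not the paper's exact definition: standard pseudolikelihood
% and a piecewise decomposition of it over rules. Assumed notation: w_r and
% \phi_r are rule r's weight and aggregate feature; x_{\neg i} is the state
% of all variables other than x_i (its Markov blanket suffices).
\ell_{\mathrm{PL}}(w) = \sum_{i} \log P_w\left(x_i \mid \mathbf{x}_{\neg i}\right)
\qquad
\ell_{\mathrm{PPLL}}(w) = \sum_{r \in R} \sum_{i}
  \log \frac{\exp\left(w_r \, \phi_r(x_i, \mathbf{x}_{\neg i})\right)}
            {\sum_{x_i'} \exp\left(w_r \, \phi_r(x_i', \mathbf{x}_{\neg i})\right)}
```

Because each summand in the piecewise form depends on a single weight w_r, adding or removing a rule during structure search leaves the remaining rules' terms untouched, which is consistent with the abstract's claim that weights need not be re-optimized at every iteration of the search.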
Award ID(s):
2023495
NSF-PAR ID:
10333980
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Uncertainty in Artificial Intelligence
ISSN:
1525-3384
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Statistical relational learning (SRL) frameworks are effective at defining probabilistic models over complex relational data. They often use weighted first-order logical rules, where the weights of the rules govern probabilistic interactions and are usually learned from data. Existing weight learning approaches typically attempt to learn a set of weights that maximizes some function of data likelihood; however, this does not always translate to optimal performance on a desired domain metric, such as accuracy or F1 score. In this paper, we introduce a taxonomy of search-based weight learning approaches for SRL frameworks that directly optimize weights on a chosen domain performance metric. To effectively apply these search-based approaches, we introduce a novel projection, referred to as scaled space (SS), that is an accurate representation of the true weight space. We show that SS removes redundancies in the weight space and captures the semantic distance between the possible weight configurations. To improve the efficiency of search, we also introduce an approximation of SS which simplifies the process of sampling weight configurations. We demonstrate these approaches on two state-of-the-art SRL frameworks: Markov logic networks and probabilistic soft logic. We perform an empirical evaluation on five real-world datasets, evaluating each on two different metrics, and compare against four other weight learning approaches. Our experimental results show that our proposed search-based approaches outperform likelihood-based approaches and yield up to a 10% improvement across a variety of performance metrics. Further, we perform an extensive evaluation to measure the robustness of our approach to different initializations and hyperparameters. The results indicate that our approach is both accurate and robust.
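As a minimal illustration of the search-based idea (not the paper's actual projection or algorithm), the sketch below samples weight configurations from a normalized space and keeps the one that maximizes a black-box domain metric. The simplex normalization, `sample_scaled_weights`, and `evaluate_f1` are all assumptions introduced for illustration:

```python
# Hedged sketch of search-based weight learning over a normalized ("scaled")
# weight space. The real SS projection, SRL frameworks, and metrics differ;
# `evaluate_f1` stands in for any black-box domain metric.
import numpy as np

def sample_scaled_weights(num_rules, rng):
    """Sample a weight configuration whose entries sum to 1.

    Assumes, as a simplification, that only the relative magnitudes of
    rule weights matter, so configurations live on the simplex.
    """
    w = rng.random(num_rules)
    return w / w.sum()

def search_weights(evaluate_f1, num_rules, num_samples=100, seed=0):
    """Keep the sampled configuration that maximizes the domain metric."""
    rng = np.random.default_rng(seed)
    best_w, best_score = None, -np.inf
    for _ in range(num_samples):
        w = sample_scaled_weights(num_rules, rng)
        score = evaluate_f1(w)  # run inference with w, score the predictions
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score
```

A caller would pass something like `search_weights(lambda w: run_inference_and_score(w), num_rules=5)`, where the evaluation function wraps whichever SRL system and metric are in use.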
  2. Learning logical rules is critical to improving reasoning in knowledge graphs (KGs). This is due to their ability to provide logical and interpretable explanations when used for predictions, as well as their ability to generalize to other tasks, domains, and data. While recent methods have been proposed to learn logical rules, the majority of these methods are either restricted by their computational complexity and cannot handle the large search space of large-scale KGs, or show poor generalization when exposed to data outside the training set. In this paper, we propose an end-to-end neural model for learning compositional logical rules called NCRL. NCRL detects the best compositional structure of a rule body and breaks it into small compositions in order to infer the rule head. By recurrently merging compositions in the rule body with a recurrent attention unit, NCRL finally predicts a single rule head. Experimental results show that NCRL learns high-quality rules and generalizes well. Specifically, we show that NCRL is scalable, efficient, and yields state-of-the-art results for knowledge graph completion on large-scale KGs. Moreover, we test NCRL for systematic generalization by learning to reason on small-scale observed graphs and evaluating on larger unseen ones.
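A minimal sketch of the recurrent pairwise-merging idea. NCRL's actual attention unit, embeddings, and training are more involved; `merge`, `score`, and the shapes below are illustrative assumptions:

```python
# Hedged sketch of recurrently composing a rule body, e.g. r1(X,Y), r2(Y,Z),
# r3(Z,W), into a single representation used to predict the rule head.
import numpy as np

def merge(e1, e2, W):
    """Compose two relation embeddings (dim d) into one; W has shape (d, 2d).
    Stand-in for the learned recurrent unit."""
    return np.tanh(W @ np.concatenate([e1, e2]))

def compose_body(body, W, score):
    """Recurrently merge the highest-scoring adjacent pair of embeddings
    until one remains; that embedding would be classified into a rule head."""
    body = list(body)
    while len(body) > 1:
        i = max(range(len(body) - 1),
                key=lambda j: score(body[j], body[j + 1]))  # attention-like pick
        body[i:i + 2] = [merge(body[i], body[i + 1], W)]
    return body[0]
```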
  3. In this work, we examine online collective inference, the problem of maintaining and performing inference over a sequence of evolving graphical models. We utilize templated graphical models (TGMs), a general class of graphical models expressed via templates and instantiated with data. A key challenge is minimizing the cost of instantiating the updated model. To address this, we define a class of exact and approximate context-aware methods for updating an existing TGM. These methods avoid a full re-instantiation by using the context of the updates to add only the relevant components to the graphical model. Further, we provide stability bounds for the general online inference problem and regret bounds for a proposed approximation. Finally, we implement our approach in probabilistic soft logic and test it on several online collective inference tasks. Through these experiments we verify the bounds on regret and stability, and show that our approximate online approach consistently runs two to five times faster than the offline alternative while, surprisingly, maintaining the quality of the predictions.
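A hedged sketch of what a context-aware update might look like: rather than re-instantiating the whole model, only the ground rules that mention updated atoms are added. The `groundings_mentioning` helper and the surrounding types are hypothetical, invented for illustration; the paper's exact and approximate methods are more refined:

```python
# Hedged sketch of a context-aware update to an existing templated grounding.
def incremental_update(grounding, templates, new_atoms):
    """Add to an existing grounding only the ground rules implied by updates.

    `grounding` is the current set of ground rules; `templates` are rule
    templates, each assumed to enumerate its candidate groundings that
    mention a given atom (hypothetical API).
    """
    for atom in new_atoms:
        for template in templates:
            for ground_rule in template.groundings_mentioning(atom):
                grounding.add(ground_rule)  # no-op if already present
    return grounding
```

The design point the abstract makes is that the cost scales with the size of the update (the atoms touched) rather than with the full model, which is where the reported two-to-five-times speedup would come from.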
  4. FOLD-R is an automated inductive learning algorithm for learning default rules for mixed (numerical and categorical) data. It generates an explainable normal logic program (NLP) rule set for classification tasks. We present an improved FOLD-R algorithm, called FOLD-R++, that increases the efficiency and scalability of FOLD-R by orders of magnitude. FOLD-R++ improves upon FOLD-R without compromising or losing information in the input training data during the encoding or feature selection phase. The FOLD-R++ algorithm is competitive in performance with the widely used XGBoost algorithm; however, unlike XGBoost, FOLD-R++ produces an explainable model. FOLD-R++ is also competitive in performance with the RIPPER system, and on large datasets FOLD-R++ outperforms RIPPER. We also create a powerful tool-set by combining FOLD-R++ with s(CASP), a goal-directed answer set programming (ASP) execution engine, to make predictions on new data samples using the normal logic program generated by FOLD-R++. The s(CASP) system also produces a justification for the prediction. Experiments presented in this paper show that our improved FOLD-R++ algorithm is a significant improvement over the original design and that the s(CASP) system can make predictions in an efficient manner.
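For flavor, here is a hand-written example (not taken from the paper) of the kind of default rule with an exception that a normal logic program expresses, e.g. `fly(X) :- bird(X), not ab(X).` with `ab(X) :- penguin(X).`, rendered as plain Python:

```python
# Hedged illustration only: a default rule with an exception ("abnormality"),
# the shape of rule FOLD-R++ learns. The actual output is a normal logic
# program executed by s(CASP); the rule and data here are invented.
BIRDS = {"tweety", "pingu"}
PENGUINS = {"pingu"}

def bird(x): return x in BIRDS
def penguin(x): return x in PENGUINS
def abnormal(x): return penguin(x)                  # exception to the default
def flies(x): return bird(x) and not abnormal(x)    # default rule

assert flies("tweety") and not flies("pingu")
```

Read declaratively, the rule says "birds fly by default, unless shown abnormal", and a system like s(CASP) can report which clauses fired as a justification for each prediction.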
  5. While neural networks have generated increasing excitement for classification tasks such as natural language processing, their lack of interpretability remains a great challenge for deployment in certain high-stakes, human-centered applications. To address this issue, we propose a new approach for generating interpretable predictions by inferring a simple three-layer neural network with threshold activations, so that it can benefit from effective neural network training algorithms while producing human-understandable explanations for its results. In particular, the hidden-layer neurons in the proposed model are trained with floating-point weights and binary output activations. The output neuron is also trainable as a threshold logic function that implements a disjunctive operation, forming the logical OR of the first-level threshold logic functions. This neural network can be trained using state-of-the-art methods to achieve high prediction accuracy. An important feature of the proposed architecture is that only a simple greedy algorithm is required to provide a human-understandable explanation with the prediction. In comparison with other explainable decision models, our proposed approach achieves more accurate predictions on a broad set of tabular classification datasets.
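A rough sketch of the described architecture and a greedy explanation step; the training procedure is omitted and the function names and the particular greedy choice are invented for illustration:

```python
# Hedged sketch: hidden units with real-valued weights and binary threshold
# activations, an output neuron acting as a logical OR, and a simple greedy
# explanation. Training (not shown) is assumed done elsewhere.
import numpy as np

def hidden_activations(x, W, b):
    """Binary threshold units: unit j fires iff W[j] @ x + b[j] >= 0."""
    return (W @ x + b >= 0).astype(int)

def predict(x, W, b):
    """Output neuron as the logical OR of the hidden threshold units."""
    return int(hidden_activations(x, W, b).any())

def greedy_explanation(x, W, b, top_k=3):
    """A positive prediction is witnessed by any firing hidden unit; greedily
    report, for one such unit, the inputs contributing most to its firing."""
    fired = np.flatnonzero(hidden_activations(x, W, b))
    if fired.size == 0:
        return None                      # negative prediction: no witness
    unit = fired[0]
    contrib = W[unit] * x                # per-feature contribution to w.x
    return unit, np.argsort(contrib)[::-1][:top_k]
```

The OR structure is what makes the explanation cheap: a single firing hidden unit suffices to justify a positive prediction, so the explanation never has to reason about the whole network.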