skip to main content

Attention:

The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 5:00 PM ET until 11:00 PM ET on Friday, June 21 due to maintenance. We apologize for the inconvenience.


Title: A taxonomy of weight learning methods for statistical relational learning
Abstract Statistical relational learning (SRL) frameworks are effective at defining probabilistic models over complex relational data. They often use weighted first-order logical rules where the weights of the rules govern probabilistic interactions and are usually learned from data. Existing weight learning approaches typically attempt to learn a set of weights that maximizes some function of data likelihood; however, this does not always translate to optimal performance on a desired domain metric, such as accuracy or F1 score. In this paper, we introduce a taxonomy of search-based weight learning approaches for SRL frameworks that directly optimize weights on a chosen domain performance metric. To effectively apply these search-based approaches, we introduce a novel projection, referred to as scaled space (SS), that is an accurate representation of the true weight space. We show that SS removes redundancies in the weight space and captures the semantic distance between the possible weight configurations. In order to improve the efficiency of search, we also introduce an approximation of SS which simplifies the process of sampling weight configurations. We demonstrate these approaches on two state-of-the-art SRL frameworks: Markov logic networks and probabilistic soft logic. We perform empirical evaluation on five real-world datasets and evaluate them each on two different metrics. We also compare them against four other weight learning approaches. Our experimental results show that our proposed search-based approaches outperform likelihood-based approaches and yield up to a 10% improvement across a variety of performance metrics. Further, we perform an extensive evaluation to measure the robustness of our approach to different initializations and hyperparameters. The results indicate that our approach is both accurate and robust.  more » « less
Award ID(s):
1703331 2023495 1740850
NSF-PAR ID:
10315168
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
Machine Learning
ISSN:
0885-6125
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Templated graphical models (TGMs) encode model structure using rules that capture recurring relationships between multiple random variables. While the rules in TGMs are interpretable, it is not clear how they can be used to generate explanations for the individual predictions of the model. Further, learning these rules from data comes with high computational costs: it typically requires an expensive combinatorial search over the space of rules and repeated optimization over rule weights. In this work, we propose a new structure learning algorithm, Explainable Structured Model Search (ESMS), that learns a templated graphical model and an explanation framework for its predictions. ESMS uses a novel search procedure to efficiently search the space of models and discover models that trade-off predictive accuracy and explainability. We introduce the notion of relational stability and prove that our proposed explanation framework is stable. Further, our proposed piecewise pseudolikelihood (PPLL) objective does not require re-optimizing the rule weights across models during each iteration of the search. In our empirical evaluation on three realworld datasets, we show that our proposed approach not only discovers models that are explainable, but also significantly outperforms existing state-out-the-art structure learning approaches. 
    more » « less
  2. Abstract

    Statistical relational learning (SRL) and graph neural networks (GNNs) are two powerful approaches for learning and inference over graphs. Typically, they are evaluated in terms of simple metrics such as accuracy over individual node labels. Complexaggregate graph queries(AGQ) involving multiple nodes, edges, and labels are common in the graph mining community and are used to estimate important network properties such as social cohesion and influence. While graph mining algorithms support AGQs, they typically do not take into account uncertainty, or when they do, make simplifying assumptions and do not build full probabilistic models. In this paper, we examine the performance of SRL and GNNs on AGQs over graphs with partially observed node labels. We show that, not surprisingly, inferring the unobserved node labels as a first step and then evaluating the queries on the fully observed graph can lead to sub-optimal estimates, and that a better approach is to compute these queries as an expectation under the joint distribution. We propose a sampling framework to tractably compute the expected values of AGQs. Motivated by the analysis of subgroup cohesion in social networks, we propose a suite of AGQs that estimate the community structure in graphs. In our empirical evaluation, we show that by estimating these queries as an expectation, SRL-based approaches yield up to a 50-fold reduction in average error when compared to existing GNN-based approaches.

     
    more » « less
  3. null (Ed.)
    Probabilistic soft logic (PSL) is a framework for instantiating probabilistic graphical models (PGM) representing complex relational data. Weighted first-order logical statements are used as templates for creating potential functions which define the PGM density. Traditionally, PSL constrains weights to be non-negative to ensure maximum a posteriori (MAP) inference is a tractable convex optimization problem. We propose three novel approaches to extending PSL's expressivity to allow negative weights. Notably, we propose the use of Gödel logic for defining potentials from negatively weighted rules. This method improves upon prior work on this topic by preserving both the convexity and scale of the MAP inference problem. Moreover, we show where each of the five methods discussed in this paper overlap and where they most differ. All negative methods are implemented in PSL, and we introduce a tunable synthetic dataset designed to empirically compare the performance of predictions. 
    more » « less
  4. Datasets involving multivariate event streams are prevalent in numerous applications. We present a novel framework for modeling temporal point processes called clock logic neural networks (CLNN) which learn weighted clock logic (wCL) formulas as interpretable temporal rules by which some events promote or inhibit other events. Specifically, CLNN models temporal relations between events using conditional intensity rates informed by a set of wCL formulas, which are more expressive than related prior work. Unlike conventional approaches of searching for generative rules through expensive combinatorial optimization, we design smooth activation functions for components of wCL formulas that enable a continuous relaxation of the discrete search space and efficient learning of wCL formulas using gradient-based methods. Experiments on synthetic datasets manifest our model's ability to recover the ground-truth rules and improve computational efficiency. In addition, experiments on real-world datasets show that our models perform competitively when compared with state-of-the-art models. 
    more » « less
  5. Explaining the results of Machine learning algorithms is crucial given the rapid growth and potential applicability of these methods in critical domains including healthcare, defense, autonomous driving, etc. In this paper, we address this problem in the context of Markov Logic Networks (MLNs) which are highly expressive statistical relational models that combine first-order logic with probabilistic graphical models. MLNs in general are known to be interpretable models, i.e., MLNs can be understood more easily by humans as compared to models learned by approaches such as deep learning. However, at the same time, it is not straightforward to obtain human-understandable explanations specific to an observed inference result (e.g. marginal probability estimate). This is because, the MLN provides a lifted interpretation, one that generalizes to all possible worlds/instantiations, which are not query/evidence specific. In this paper, we extract grounded-explanations, i.e., explanations defined w.r.t specific inference queries and observed evidence. We extract these explanations from importance weights defined over the MLN formulas that encode the contribution of formulas towards the final inference results. We validate our approach in real world problems related to analyzing reviews from Yelp, and show through user-studies that our explanations are richer than state-of-the-art non-relational explainers such as LIME . 
    more » « less