Answering complex First-Order Logic (FOL) queries on large-scale incomplete knowledge graphs (KGs) is an important yet challenging task. Recent advances embed logical queries and KG entities in the same space and conduct query answering via dense similarity search. However, most logical operators designed in previous studies do not satisfy the axiomatic system of classical logic, limiting their performance. Moreover, these logical operators are parameterized and thus require many complex FOL queries as training data, which are often arduous to collect or even inaccessible in most real-world KGs. We thus present FuzzQE, a fuzzy-logic-based logical query embedding framework for answering FOL queries over KGs. FuzzQE follows fuzzy logic to define logical operators in a principled and learning-free manner, where only entity and relation embeddings require learning. FuzzQE can further benefit from labeled complex logical queries for training. Extensive experiments on two benchmark datasets demonstrate that FuzzQE answers FOL queries significantly better than state-of-the-art methods. In addition, FuzzQE trained with only KG link prediction achieves performance comparable to models trained with extra complex query data.
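For intuition, the snippet below is a minimal sketch of the fuzzy-logic idea: conjunction, disjunction, and negation defined directly on entity scores in [0, 1] with no learned parameters. The product t-norm used here is one standard choice and only an illustrative assumption, not necessarily the exact operators FuzzQE adopts.

```python
import numpy as np

def fuzzy_and(a, b):
    # Product t-norm: conjunction of two fuzzy truth-value vectors.
    return a * b

def fuzzy_or(a, b):
    # Probabilistic sum (the dual t-conorm): disjunction.
    return a + b - a * b

def fuzzy_not(a):
    # Standard fuzzy negation.
    return 1.0 - a

# Toy example: membership scores of three candidate entities
# for two sub-queries of a conjunctive FOL query.
q1 = np.array([0.9, 0.2, 0.7])
q2 = np.array([0.8, 0.6, 0.1])
print(fuzzy_and(q1, q2))  # entities likely to satisfy both sub-queries
print(fuzzy_or(q1, q2))   # entities likely to satisfy either sub-query
print(fuzzy_not(q1))      # complement of the first sub-query
```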
An Evaluation of Strategies to Train More Efficient Backward-Chaining Reasoners
Knowledge bases traditionally require manual optimization to ensure reasonable performance when answering queries. We build on previous work on training a deep learning model to learn heuristics for answering queries by comparing different representations of the sentences contained in knowledge bases. We decompose the problem into issues of representation, training, and control, and propose solutions for each subproblem. We evaluate different configurations on three synthetic knowledge bases. In particular, we compare a novel representation approach, based on learning to maximize the similarity of logical atoms that unify and minimize the similarity of atoms that do not unify, against two vectorization strategies taken from the automated theorem proving literature: a chain-based and a 3-term-walk strategy. We also evaluate the efficacy of pruning the search by ignoring rules with scores below a threshold.
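To make the two ideas above concrete, here is a small, hypothetical sketch (not the code evaluated in the paper) of (a) a margin-based objective that pushes embeddings of unifying atoms together and non-unifying atoms apart, and (b) pruning rules whose heuristic scores fall below a threshold; the embedding source, margin, and threshold values are all illustrative assumptions.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two atom embeddings.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9))

def unification_loss(anchor, unifiable, non_unifiable, margin=0.5):
    # Margin loss: reward high similarity for an atom that unifies with
    # the anchor and low similarity for an atom that does not.
    return max(0.0, margin - cosine(anchor, unifiable) + cosine(anchor, non_unifiable))

def prune_rules(scored_rules, threshold=0.3):
    # Drop rules whose learned heuristic score is below the threshold,
    # shrinking the backward-chaining search space.
    return [rule for rule, score in scored_rules if score >= threshold]

# Toy usage with random vectors standing in for encoded atoms.
rng = np.random.default_rng(0)
a, p, n = rng.normal(size=(3, 16))
print(unification_loss(a, p, n))
print(prune_rules([("r1", 0.8), ("r2", 0.1), ("r3", 0.4)]))
```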
- Award ID(s): 2051037
- PAR ID: 10522311
- Publisher / Repository: ACM
- Date Published:
- ISBN: 9798400701412
- Page Range / eLocation ID: 206 to 213
- Format(s): Medium: X
- Location: Pensacola, FL, USA
- Sponsoring Org: National Science Foundation
More Like this
- We consider the problem of answering queries about formulas of first-order logic based on background knowledge partially represented explicitly as other formulas, and partially represented as examples independently drawn from a fixed probability distribution. PAC semantics, introduced by Valiant, is one rigorous, general proposal for learning to reason in formal languages: although weaker than classical entailment, it allows for a powerful model-theoretic framework for answering queries while requiring minimal assumptions about the form of the distribution in question. To date, however, the most significant limitation of that approach, and more generally of most machine learning approaches with robustness guarantees, is that the logical language is ultimately essentially propositional, with finitely many atoms. Indeed, the theoretical findings on the learning of relational theories in such generality have been resoundingly negative. This is despite the fact that first-order logic is widely argued to be most appropriate for representing human knowledge. In this work, we present a new theoretical approach to robustly learning to reason in first-order logic, and consider universally quantified clauses over a countably infinite domain. Our results exploit symmetries exhibited by constants in the language, and generalize the notion of implicit learnability to show how queries can be computed against (implicitly) learned first-order background knowledge.
- Logical queries constitute an important subset of questions posed in knowledge graph question answering systems. Yet, effectively answering logical queries on large knowledge graphs remains a highly challenging problem. Traditional subgraph matching based methods might suffer from the noise and incompleteness of the underlying knowledge graph, often with a prolonged online response time. Recently, an alternative type of method has emerged whose key idea is to embed knowledge graph entities and the query in an embedding space so that the embedding of answer entities is close to that of the query. Compared with subgraph matching based methods, it can better handle noisy or missing information in the knowledge graph, with a faster online response. Promising as it might be, several fundamental limitations still exist, including the linear transformation assumption for modeling relations and the inability to answer complex queries with multiple variable nodes. In this paper, we propose an embedding based method (NewLook) to address these limitations. Our proposed method offers three major advantages. First (Applicability), it supports four types of logical operations and can answer queries with multiple variable nodes. Second (Effectiveness), the proposed NewLook goes beyond the linear transformation assumption, and thus consistently outperforms the existing methods. Third (Efficiency), compared with subgraph matching based methods, NewLook is at least 3 times faster in answering the queries; compared with the existing embedding based methods, NewLook bears a comparable or even faster online response and offline training time. (A sketch of the embed-and-rank idea behind such methods follows this list.)
- Moens, Marie-Francine; Huang, Xuanjing; Specia, Lucia; Yih, Scott Wen-tau (Eds.): Knowledge-based visual question answering (VQA) requires answering questions with external knowledge in addition to the content of images. One dataset that is mostly used in evaluating knowledge-based VQA is OK-VQA, but it lacks a gold-standard knowledge corpus for retrieval. Existing work leverages different knowledge bases (e.g., ConceptNet and Wikipedia) to obtain external knowledge. Because of the varying knowledge bases, it is hard to fairly compare models' performance. To address this issue, we collect a natural language knowledge base that can be used for any VQA system. Moreover, we propose a Visual Retriever-Reader pipeline to approach knowledge-based VQA. The visual retriever aims to retrieve relevant knowledge, and the visual reader seeks to predict answers based on the given knowledge. We introduce various ways to retrieve knowledge using text and images, and two reader styles: classification and extraction. Both the retriever and the reader are trained with weak supervision. Our experimental results show that a good retriever can significantly improve the reader's performance on the OK-VQA challenge. (A schematic of the retriever-reader pipeline follows this list.)
- We present a scalable approach for Detecting Objects by transferring Common-sense Knowledge (DOCK) from source to target categories. In our setting, the training data for the source categories have bounding box annotations, while those for the target categories have only image-level annotations. Current state-of-the-art approaches focus on image-level visual or semantic similarity to adapt a detector trained on the source categories to the new target categories. In contrast, our key idea is to (i) use similarity not at the image level, but rather at the region level, and (ii) leverage richer common-sense cues (based on attributes, spatial relations, etc.) to guide the algorithm towards learning the correct detections. We acquire such common-sense cues automatically from readily available knowledge bases without any extra human effort. On the challenging MS COCO dataset, we find that common-sense knowledge can substantially improve detection performance over existing transfer-learning baselines.
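As promised above, a minimal sketch of the embed-and-rank idea behind embedding-based query answering described in the NewLook abstract: embed the query, then return the entities whose embeddings lie closest to it. The encoder is omitted and the random embeddings are placeholders, not NewLook's actual model or transformation.

```python
import numpy as np

def answer_query(query_emb, entity_embs, k=5):
    # Rank all KG entities by cosine similarity to the query embedding
    # and return the indices of the k closest candidates as answers.
    sims = entity_embs @ query_emb
    sims /= np.linalg.norm(entity_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9
    return np.argsort(-sims)[:k]

# Toy setup: 1000 entities and one query embedded in a 64-dimensional space.
rng = np.random.default_rng(1)
entities = rng.normal(size=(1000, 64))
query = rng.normal(size=64)
print(answer_query(query, entities))
```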
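And a schematic of the Visual Retriever-Reader pipeline from the OK-VQA abstract above; the function names and signatures here are hypothetical placeholders, since the actual system retrieves with both text and image signals and trains both stages with weak supervision.

```python
def answer_question(question, image_features, knowledge_base, retrieve, read, top_k=5):
    # Stage 1: retrieve candidate knowledge passages for the question/image.
    passages = retrieve(question, image_features, knowledge_base, top_k)
    # Stage 2: the reader predicts an answer conditioned on the passages.
    # It may be a classifier over a fixed answer vocabulary or an
    # extractive model that selects a span from the retrieved passages.
    return read(question, image_features, passages)

# Toy usage with trivial stand-ins for the retriever and the reader.
kb = ["knowledge sentence 1", "knowledge sentence 2"]
toy_retrieve = lambda q, img, kb, k: kb[:k]
toy_read = lambda q, img, passages: passages[0]
print(answer_question("what is shown here?", None, kb, toy_retrieve, toy_read))
```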