Deep learning and symbolic reasoning are complementary techniques for an intelligent system. However, principled combinations of these techniques are typically limited in scalability, rendering them ill-suited for real-world applications. We propose Scallop, a system that builds upon probabilistic deductive databases, to bridge this gap. The key insight underlying Scallop is a provenance framework that introduces a tunable parameter to specify the level of reasoning granularity. Scallop thereby i) generalizes exact probabilistic reasoning, ii) asymptotically reduces computational cost, and iii) provides relative accuracy guarantees. On synthetic tasks involving mathematical and logical reasoning, Scallop scales significantly better without sacrificing accuracy compared to DeepProbLog, a principled neural logic programming approach. Scallop also scales to a newly created real-world Visual Question Answering (VQA) benchmark that requires multi-hop reasoning, achieving 84.22% accuracy and outperforming two VQA-tailored models based on Neural Module Networks and transformers by 12.42% and 21.66% respectively.
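The tunable reasoning granularity described above can be read, in provenance terms, as retaining only a bounded number of proofs per derived fact. The following is a minimal illustrative sketch of such a top-k proofs provenance in Python; the function names, data, and independence assumption are ours, and this is not Scallop's actual API.

```python
from itertools import product

# Minimal sketch of a "top-k proofs" provenance (illustrative only; not Scallop's API).
# A proof is a frozenset of input fact ids; each input fact has a probability.

def proof_prob(proof, fact_probs):
    """Probability that every input fact in a single proof holds (assuming independence)."""
    p = 1.0
    for fact in proof:
        p *= fact_probs[fact]
    return p

def top_k(proofs, fact_probs, k):
    """Keep only the k most probable proofs -- k is the tunable granularity parameter."""
    return sorted(proofs, key=lambda pr: proof_prob(pr, fact_probs), reverse=True)[:k]

def plus(a, b, fact_probs, k):
    """Disjunction: union of the two proof sets, truncated to the top k."""
    return top_k(set(a) | set(b), fact_probs, k)

def times(a, b, fact_probs, k):
    """Conjunction: pairwise union of proofs, truncated to the top k."""
    combined = {p1 | p2 for p1, p2 in product(a, b)}
    return top_k(combined, fact_probs, k)

def prob_lower_bound(proofs, fact_probs):
    """Crude lower bound on a derived fact's probability from its retained proofs."""
    return max((proof_prob(pr, fact_probs) for pr in proofs), default=0.0)

if __name__ == "__main__":
    probs = {"a": 0.9, "b": 0.8, "c": 0.2}
    r1 = [frozenset({"a"})]             # one derivation using fact a
    r2 = [frozenset({"b", "c"})]        # another derivation using facts b and c
    either = plus(r1, r2, probs, k=1)   # k=1 keeps only the most probable proof
    both = times(r1, r2, probs, k=1)
    print(either, prob_lower_bound(either, probs))
    print(both, prob_lower_bound(both, probs))
```

Exact probabilistic reasoning corresponds to keeping all proofs, while small k trades accuracy for cost, which is the scalability knob the abstract describes.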
Provenance for Probabilistic Logic Programs
Despite the emergence of probabilistic logic programming (PLP) languages for data-driven applications, there are currently no provenance-based debugging tools for PLP programs. In this paper, we propose a novel provenance model and system, called P3 (Provenance for Probabilistic logic Programs), for analyzing PLP programs. P3 enables four types of provenance queries: traditional explanation queries, queries for finding the set of most important derivations within an approximate error, top-K most influential queries, and modification queries that allow tuple probabilities to be changed with the fewest modifications to the program or input data. We apply these queries to real-world scenarios and present theoretical analysis and practical algorithms for them. We have developed a prototype of P3, and our evaluation on real-world data demonstrates that the system supports a wide range of provenance queries with explainable results. Moreover, the system maintains provenance and executes queries efficiently with low overhead.
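To make the query types concrete, here is a small hypothetical illustration in Python (not P3's actual interface): the provenance of a derived tuple is a set of derivations over probabilistic input tuples, from which we answer a "most important derivations within an approximate error" query and a "top-K most influential tuples" query. The tuple names, probabilities, independence assumption, and the exact reading of "within an approximate error" are invented for the example.

```python
# Hypothetical illustration of two P3-style provenance queries (not the actual P3 system).
# The provenance of one derived tuple: a list of derivations, each a set of input tuple ids.

derivations = [{"t1", "t2"}, {"t1", "t3"}, {"t4"}]
tuple_probs = {"t1": 0.9, "t2": 0.5, "t3": 0.7, "t4": 0.1}

def derivation_prob(derivation):
    """Probability that every input tuple of one derivation holds (independence assumed)."""
    p = 1.0
    for t in derivation:
        p *= tuple_probs[t]
    return p

def most_important_derivations(eps):
    """Keep the most probable derivations until the dropped ones contribute at most eps."""
    ranked = sorted(derivations, key=derivation_prob, reverse=True)
    kept, remaining = [], sum(map(derivation_prob, ranked))
    for d in ranked:
        if remaining <= eps:
            break
        kept.append(d)
        remaining -= derivation_prob(d)
    return kept

def top_k_influential(k):
    """Rank input tuples by the total probability mass of the derivations they appear in."""
    def influence(t):
        return sum(derivation_prob(d) for d in derivations if t in d)
    return sorted(tuple_probs, key=influence, reverse=True)[:k]

print(most_important_derivations(eps=0.1))   # drops the unlikely derivation {'t4'}
print(top_k_influential(k=2))                # ['t1', 't3']
```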
- Award ID(s): 1704189
- PAR ID: 10485034
- Publisher / Repository: OpenProceedings
- Date Published:
- Journal Name: 23rd International Conference on Extending Database Technology (EDBT)
- ISSN: 2367-2005
- ISBN: 978-3-89318-083-7
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Answering complex First-Order Logical (FOL) queries on large-scale incomplete knowledge graphs (KGs) is an important yet challenging task. Recent advances embed logical queries and KG entities in the same space and conduct query answering via dense similarity search. However, most logical operators designed in previous studies do not satisfy the axiomatic system of classical logic, limiting their performance. Moreover, these logical operators are parameterized and thus require many complex FOL queries as training data, which are often arduous to collect or even inaccessible in most real-world KGs. We thus present FuzzQE, a fuzzy-logic-based query embedding framework for answering FOL queries over KGs. FuzzQE follows fuzzy logic to define logical operators in a principled and learning-free manner, where only entity and relation embeddings require learning. FuzzQE can further benefit from labeled complex logical queries for training. Extensive experiments on two benchmark datasets demonstrate that FuzzQE provides significantly better performance in answering FOL queries compared to state-of-the-art methods. In addition, FuzzQE trained with only KG link prediction can achieve performance comparable to models trained with extra complex query data. (A minimal sketch of such learning-free fuzzy operators appears after this list.)
- Endrullis, Jörg; Schmitz, Sylvain (Eds.): Kleene Algebra with Tests (KAT) provides a framework for algebraic equational reasoning about imperative programs. The recent variant Guarded KAT (GKAT) makes it possible to reason about non-probabilistic properties of probabilistic programs. Here we introduce an extension of this framework called approximate GKAT (aGKAT), which equips GKAT with a partially ordered monoid (the real numbers), making it possible to express that a (deterministic) property is satisfied except with a probability up to a certain bound. This allows proofs of probabilistic programs based on the union bound, a technique from basic probability theory, to be represented in equational reasoning "à la KAT". We show how a propositional variant of approximate Hoare Logic (aHL), a program logic for the union bound, can be soundly encoded in our system aGKAT. We then illustrate the use of aGKAT with an example of accuracy analysis from the field of differential privacy. (A union-bound sketch appears after this list.)
- Using synchronous dynamical systems (SyDSs) as a formal model for networked social systems, we study the problem of inferring users' choices in such systems. We observe that SyDSs with deterministic and probabilistic threshold functions as local functions can capture users' choices in the context of contagion propagation in social networks. We use an active query mechanism in which a user interacts with a system by submitting queries, and the responses to the queries are used to infer the local functions. We develop methods that provide provably efficient query sets for inferring both deterministic and probabilistic forms of threshold functions. We also present experimental results using real-world social networks. (A small threshold-inference sketch appears after this list.)
- Probabilistic programs often trade accuracy for efficiency, and thus may, with a small probability, return an incorrect result. It is important to obtain precise bounds for the probability of these errors, but existing verification approaches have limitations that lead to error probability bounds that are excessively coarse, or only apply to first-order programs. In this paper we present Eris, a higher-order separation logic for proving error probability bounds for probabilistic programs written in an expressive higher-order language. Our key novelty is the introduction of error credits, a separation logic resource that tracks an upper bound on the probability that a program returns an erroneous result. By representing error bounds as a resource, we recover the benefits of separation logic, including compositionality, modularity, and dependency between errors and program terms, allowing for more precise specifications. Moreover, we enable novel reasoning principles such as expectation-preserving error composition, amortized error reasoning, and error induction. We illustrate the advantages of our approach by proving amortized error bounds on a range of examples, including collision probabilities in hash functions, which allow us to write more modular specifications for data structures that use them as clients. We also use our logic to prove correctness and almost-sure termination of rejection sampling algorithms. All of our results have been mechanized in the Coq proof assistant using the Iris separation logic framework and the Coquelicot real analysis library. (A sketch of the amortized error accounting appears after this list.)
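To make the fuzzy-logic entry above concrete: under the product t-norm and its dual t-conorm, conjunction, disjunction, and negation of truth scores in [0, 1] are defined without any learned parameters. The sketch below is illustrative only; FuzzQE applies such operators to query embeddings rather than scalar scores, and the specific t-norm choice here is our assumption.

```python
# Minimal sketch of learning-free fuzzy logical operators (product t-norm / probabilistic sum).
# Illustrative only: FuzzQE applies such operators to query embeddings, not scalar scores.

def f_and(x: float, y: float) -> float:
    """Fuzzy conjunction (product t-norm)."""
    return x * y

def f_or(x: float, y: float) -> float:
    """Fuzzy disjunction (probabilistic sum, the t-conorm dual to the product t-norm)."""
    return x + y - x * y

def f_not(x: float) -> float:
    """Fuzzy negation."""
    return 1.0 - x

# Classical truth values behave classically...
assert f_and(1.0, 0.0) == 0.0 and f_or(1.0, 0.0) == 1.0
# ...and graded truth values compose smoothly in between.
print(f_and(0.8, 0.6), f_or(0.8, 0.6), f_not(0.8))   # approximately 0.48, 0.92, 0.2
```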
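The union bound used in the aGKAT entry states that the probability that any of several bad events occurs is at most the sum of their individual probabilities. The sketch below is a plain-Python illustration of that accounting for a sequence of probabilistic steps, not aGKAT itself; the step bounds and simulation are invented for the example.

```python
# Plain-Python illustration of union-bound reasoning (not aGKAT itself): if step i
# violates its postcondition with probability at most eps_i, the whole run violates
# some postcondition with probability at most sum(eps_i).

import random

def noisy_step(eps: float) -> bool:
    """A step that satisfies its (deterministic) postcondition except with probability eps."""
    return random.random() >= eps

def run_program(step_bounds) -> bool:
    """Run every step; True iff all postconditions held on this run."""
    return all(noisy_step(eps) for eps in step_bounds)

step_bounds = [0.01, 0.02, 0.005]
union_bound = sum(step_bounds)            # 0.035
trials = 100_000
failures = sum(not run_program(step_bounds) for _ in range(trials))
print(f"observed failure rate: {failures / trials:.4f}; union bound: {union_bound}")
```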
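For the synchronous dynamical systems entry, one simple way to infer a deterministic threshold function is to binary search on the number of active neighbors, using one activation query per step. The sketch below is a simplification we introduce for illustration (a single node whose response can be probed directly), not the paper's actual query mechanism.

```python
# Simplified sketch of inferring a deterministic threshold function via active queries
# (not the paper's algorithm): binary search over the number of active neighbors.

def make_threshold_node(threshold: int):
    """A node that activates iff at least `threshold` of its neighbors are active."""
    return lambda num_active_neighbors: num_active_neighbors >= threshold

def infer_threshold(query, degree: int) -> int:
    """Recover the hidden threshold in O(log degree) queries via binary search."""
    lo, hi = 0, degree + 1          # threshold degree+1 means "never activates"
    while lo < hi:
        mid = (lo + hi) // 2
        if query(mid):              # activates with mid active neighbors -> threshold <= mid
            hi = mid
        else:                       # does not activate -> threshold > mid
            lo = mid + 1
    return lo

node = make_threshold_node(threshold=4)
print(infer_threshold(node, degree=10))   # 4
```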
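The error-credit idea in the last entry can be previewed with a hash-collision example: inserting the i-th key into a table with m slots collides with the i existing keys with probability at most i/m, so charging i/m "credits" per insertion bounds the total collision probability for n insertions by n(n-1)/(2m). The Python sketch below illustrates this accounting only; the actual Eris development is a Coq/Iris separation logic, and the uniform-hashing simulation here is our assumption.

```python
# Plain-Python sketch of amortized error accounting for hash collisions
# (an illustration of the error-credit idea, not the Eris/Iris development).
import random

def collision_free_insertions(n: int, m: int, rng: random.Random) -> bool:
    """Insert n uniformly hashed keys into m slots; report whether no collision occurred."""
    seen = set()
    for _ in range(n):
        slot = rng.randrange(m)
        if slot in seen:
            return False
        seen.add(slot)
    return True

def error_budget(n: int, m: int) -> float:
    """Total error credits: the i-th insertion is charged i/m, summing to n(n-1)/(2m)."""
    return sum(i / m for i in range(n))

n, m, trials = 20, 1000, 50_000
rng = random.Random(0)
observed = sum(not collision_free_insertions(n, m, rng) for _ in range(trials)) / trials
print(f"observed collision rate {observed:.4f} <= credited bound {error_budget(n, m):.4f}")
```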