skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Consistent Answers of Aggregation Queries via SAT
The framework of database repairs and consistent answers to queries is a principled approach to managing inconsistent databases. We describe the first system able to compute the consistent answers of general aggregation queries with the COUNT( A ), COUNT (*), and SUM operators, and with or without grouping constructs. Our system uses reductions to optimization versions of Boolean satisfiability (SAT) and then leverages powerful SAT solvers. We carry out an extensive set of experiments on both synthetic and real-world data that demonstrate the usefulness and scalability of this approach.  more » « less
Award ID(s):
1814152
PAR ID:
10358316
Author(s) / Creator(s):
;
Date Published:
Journal Name:
2022 IEEE 38th International Conference on Data Engineering (ICDE)
Page Range / eLocation ID:
924 to 937
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. A physical blocks world, despite its relative simplicity, requires (in fully interactive form) a rich set of functional capabilities, ranging from vision to natural language understanding. In this work we tackle spatial question answering in a holistic way, using a vision system, speech input and output mediated by an animated avatar, a dialogue system that robustly interprets spatial queries, and a constraint solver that derives answers based on 3-D spatial modeling. The contributions of this work include a semantic parser that maps spatial questions into logical forms consistent with a general approach to meaning representation, a dialogue manager based on a schema representation, and a constraint solver for spatial questions that provides answers in agreement with human perception. These and other components are integrated into a multi-modal human-computer interaction pipeline. 
    more » « less
  2. Individuals and organizations are using databases to store personal information at an unprecedented rate. This creates a quandary for data providers. They are responsible for protecting the privacy of individuals described in their database. On the other hand, data providers are sometimes required to provide statistics about their data instead of sharing it wholesale with strong assurances that these answers are correct and complete such as in regulatory filings for the US SEC and other goverment organizations. We introduce a system,ZKSQL, that provides authenticated answers to ad-hoc SQL queries with zero-knowledge proofs. Its proofs show that the answers are correct and sound with respect to the database's contents and they do not divulge any information about its input records. This system constructs proofs over the steps in a query's evaluation and it accelerates this process with authenticated set operations. We validate the efficiency of this approach over a suite of TPC-H queries and our results show that ZKSQL achieves two orders of magnitude speedup over the baseline. 
    more » « less
  3. Knot mosaics are a model of a quantum knot system. A knot mosaic is a m-by-n grid where each location on the grid may contain any of 11 possible tiles such that the final layout has closed loops. Oh et al. proved a recurrence relation of state matrices to count the number of m-by-n knot mosaics. Our contribution is to use ALLSAT solvers to count knot mosaics and to experimentally try different ways to encode the AT MOST ONE constraint in SAT. We plan to use our SAT method as a tool to list knot mosaics of interest for specific classes of knots. 
    more » « less
  4. Range aggregate queries (RAQs) are an integral part of many real-world applications, where, often, fast and approximate answers for the queries are desired. Recent work has studied answering RAQs using machine learning (ML) models, where a model of the data is learned to answer the queries. However, there is no theoretical understanding of why and when the ML based approaches perform well. Furthermore, since the ML approaches model the data, they fail to capitalize on any query specific information to improve performance in practice. In this paper, we focus on modeling "queries" rather than data and train neural networks to learn the query answers. This change of focus allows us to theoretically study our ML approach to provide a distribution and query dependent error bound for neural networks when answering RAQs. We confirm our theoretical results by developing NeuroSketch, a neural network framework to answer RAQs in practice. Extensive experimental study on real-world, TPC-benchmark and synthetic datasets show that NeuroSketch answers RAQs multiple orders of magnitude faster than state-of-the-art and with better accuracy. 
    more » « less
  5. Retrieving short, precise answers to non-factoid queries is an increasingly important task, especially for mobile and voice search. Many of these questions may have multiple or alternative answers. In an environment where answers are presented incrementally, this raises the question of how to generate a diverse ranking to cover these alternatives. Existing search diversification algorithms generate diverse document rankings using explicit or implicit methods based on topical similarity. The goal of this paper is to evaluate the impact of applying these existing document diversification frameworks to the problem of answer diversification to determine if topical diversity is related to answer diversity. Using two common diversification algorithms, xQUAD and PM-2, and three question answering test collections, we show that topic diversification can help to generate more effective rankings but is not consistent across different queries and test collections. 
    more » « less