skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 10:00 PM ET on Friday, February 6 until 10:00 AM ET on Saturday, February 7 due to maintenance. We apologize for the inconvenience.


Title: Debugging Database Queries: A Survey of Tools, Techniques, and Users
Database management systems (or DBMSs) have been around for decades, and yet are still difficult to use, particularly when trying to identify and fix errors in user programs (or queries). We seek to understand what methods have been proposed to help people debug database queries, and whether these techniques have ultimately been adopted by DBMSs (and users). We conducted an interdisciplinary review of 112 papers and tools from the database, visualisation and HCI communities. To better understand whether academic and industry approaches are meeting the needs of users, we interviewed 20 database users (and some designers), and found surprising results. In particular, there seems to be a wide gulf between users' debugging strategies and the functionality implemented in existing DBMSs, as well as proposed in the literature. In response, we propose new design guidelines to help system designers to build features that more closely match users debugging strategies.  more » « less
Award ID(s):
1850115
PAR ID:
10301348
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
Page Range / eLocation ID:
1 to 16
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Artificial intelligence algorithms have been used to enhance a wide variety of products and services, including assisting human decision making in high-stake contexts. However, these algorithms are complex and have trade-offs, notably between prediction accuracy and fairness to population subgroups. This makes it hard for designers to understand algorithms and design products or services in a way that respects users' goals, values, and needs. We proposed a method to help designers and users explore algorithms, visualize their trade-offs, and select algorithms with trade-offs consistent with their goals and needs. We evaluated our method on the problem of predicting criminal defendants' likelihood to re-offend through (i) a large-scale Amazon Mechanical Turk experiment, and (ii) in-depth interviews with domain experts. Our evaluations show that our method can help designers and users of these systems better understand and navigate algorithmic trade-offs. This paper contributes a new way of providing designers the ability to understand and control the outcomes of algorithmic systems they are creating. 
    more » « less
  2. Database Management Systems play an indispensable role in modern cyberspace. While multiple fuzzing frameworks have been proposed in recent years to test relational (SQL) DBMSs to improve their security, non-relational (NoSQL) DBMSs have yet to experience the same scrutiny and lack an effective testing solution in general. In this work, we identify three limitations of existing approaches when extended to fuzz the DBMSs effectively in general: being non-generic, using static constraints, and generating loose data dependencies. Then, we propose effective solutions to address these limitations. We implement our solutions into an end-to-end fuzzing framework, BUZZBEE, which can effectively fuzz both relational and non-relational DBMSs. BUZZBEE successfully discovered 40 vulnerabilities in eight DBMSs of four different data models, of which 25 have been fixed with 4 new CVEs assigned. In our evaluation, BUZZBEE outperforms state-of-the-art generic fuzzers by up to 177% in terms of code coverage and discovers 30x more bugs than the second-best fuzzer for non-relational DBMSs, while achieving comparable results with specialized SQL fuzzers for the relational counterpart. 
    more » « less
  3. In database management systems (DBMSs) that handle multiple concurrent queries, adapting to fluctuating workloads is crucial. This flexibility allows the DBMS to revise decisions based on current workload and available resources. As memory availability changes with the arrival or completion of queries, having memory-intensive operators like the Hybrid Hash Join that dynamically adapt is vital. This paper introduces a new memory-adaptive Hash-Based join algorithm design implemented in Apache AsterixDB and evaluates its responsiveness to memory variability. 
    more » « less
  4. Query formulation is increasingly performed by systems that need to guess a user's intent (e.g. via spoken word interfaces). But how can a user know that the computational agent is returning answers to the right query? More generally, given that relational queries can become pretty complicated,how can we help users understand existing relational queries, whether human-generated or automatically generated? Now seems the right moment to revisit a topic that predates the birth of the relational model: developing visual metaphors that help users understand relational queries. This lecture-style tutorial surveys the keyvisual metaphors developed for visual representations of relational expressions.We will survey the history and state-of-the art of relationally-complete diagrammatic representations of relational queries, discuss the key visual metaphors developed in over a century of investigating diagrammatic languages, and organize the landscape by mapping their used visual alphabets to the syntax and semantics of Relational Algebra (RA) and Relational Calculus (RC). 
    more » « less
  5. Relational databases play an important role in business, science, and more. However, many users cannot fully unleash the analytical power of relational databases, because they are not familiar with database languages such as SQL. Many techniques have been proposed to automatically generate SQL from natural language, but they suffer from two issues: (1) they still make many mistakes, particularly for complex queries, and (2) they do not provide a flexible way for non-expert users to validate and refine incorrect queries. To address these issues, we introduce a new interaction mechanism that allows users to directly edit a step-by-step explanation of a query to fix errors. Our experiments on multiple datasets, as well as a user study with 24 participants, demonstrate that our approach can achieve better performance than multiple SOTA approaches. 
    more » « less