Debugging Missing Answers for Spark Queries over Nested Data with Breadcrumb

Diestelkämper, Ralf; Lee, Seokki; Glavic, Boris; Herschel, Melanie

doi:10.14778/3476311.3476331

We present Breadcrumb, a system that aids developers in debugging queries through query-based explanations for missing answers. Given as input a query and an expected, but missing, query result, Breadcrumb identifies operators in the input query that are responsible for the failure to derive the missing answer. These operators form explanations that guide developers who can then focus their debugging efforts on fixing these parts of the query. Breadcrumb is implemented on top of Apache Spark. Our approach is the first that scales to big data dimensions and is capable of finding explanations for common errors in queries over nested and de-normalized data, e.g., errors based on misinterpreting schema semantics.
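To make the notion of a query-based explanation concrete, the following is a minimal toy sketch in plain Python, not Breadcrumb's algorithm and not Spark code. It assumes a query can be modeled as a linear list of named operators over lists of dictionaries, and it simply reports the first operator after which no row compatible with the missing answer survives. All names here (`first_blocking_operator`, `flatten_addresses`, `filter_recent`, the `people` data) are hypothetical and chosen only for illustration; this simple tracing does not attempt to cover the schema-semantics errors that the abstract says Breadcrumb can explain.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]
# Hypothetical model: an operator is a (name, function) pair over lists of rows.
Operator = Tuple[str, Callable[[List[Row]], List[Row]]]


def flatten_addresses(rows: List[Row]) -> List[Row]:
    """Unnest the nested 'addresses' list: one output row per (person, address)."""
    return [
        {"name": r["name"], "city": a["city"], "year": a["year"]}
        for r in rows
        for a in r["addresses"]
    ]


def filter_recent(rows: List[Row]) -> List[Row]:
    """Keep only addresses recorded in 2019 or later."""
    return [r for r in rows if r["year"] >= 2019]


def project_name_city(rows: List[Row]) -> List[Row]:
    """Keep only the name and city attributes."""
    return [{"name": r["name"], "city": r["city"]} for r in rows]


def first_blocking_operator(
    data: List[Row],
    pipeline: List[Operator],
    is_compatible: Callable[[Row], bool],
) -> Optional[str]:
    """Return the name of the first operator whose output contains no row
    compatible with the missing answer, or None if a compatible row
    reaches the final result."""
    rows = data
    for name, op in pipeline:
        rows = op(rows)
        if not any(is_compatible(r) for r in rows):
            return name
    return None


# Illustrative nested input (invented for this sketch).
people = [
    {"name": "Sue", "addresses": [{"city": "LA", "year": 2018},
                                  {"city": "NY", "year": 2020}]},
    {"name": "Bob", "addresses": [{"city": "LA", "year": 2021}]},
]

pipeline = [
    ("flatten_addresses", flatten_addresses),
    ("filter_recent", filter_recent),
    ("project_name_city", project_name_city),
]

# Why is (Sue, LA) missing from the result?
blocker = first_blocking_operator(
    people, pipeline,
    lambda r: r.get("name") == "Sue" and r.get("city") == "LA",
)
print(blocker)
```

In this toy run the reported operator is the year filter, because Sue's LA address dates from 2018. A developer would then inspect that operator first, which is the kind of guidance the abstract describes.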

Award ID(s): 2107107; 1956123

PAR ID: 10358641

Author(s) / Creator(s): Diestelkämper, Ralf; Lee, Seokki; Glavic, Boris; Herschel, Melanie

Date Published: 2021-07-01

Journal Name: Proceedings of the VLDB Endowment

Volume: 14

Issue: 12

ISSN: 2150-8097

Page Range / eLocation ID: 2731 to 2734

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article: https://doi.org/10.14778/3476311.3476331