PerfDebug: Performance Debugging of Computation Skew in Dataflow Systems

Teoh, Jason; Gulzar, Muhammad Ali; Xu, Guoqing Harry; Kim, Miryung

doi:10.1145/3357223.3362727

Performance is a key factor for big data applications, and much research has been devoted to optimizing these applications. While prior work can diagnose and correct data skew, the problem of computation skew (abnormally high computation costs for a small subset of input data) has been largely overlooked. Computation skew commonly occurs in real-world applications, yet no tool is available for developers to pinpoint its underlying causes. To enable a user to debug applications that exhibit computation skew, we develop PerfDebug, a post-mortem performance debugging tool. PerfDebug automatically finds the input records responsible for such abnormalities in a big data application by reasoning about deviations in performance metrics such as job execution time, garbage collection time, and serialization time. The key to PerfDebug's success is a data provenance-based technique that computes and propagates record-level computation latency to keep track of abnormally expensive records throughout the pipeline. Finally, the input records with the largest latency contributions are presented to the user for bug fixing. We evaluate PerfDebug via in-depth case studies and observe that remediation, such as removing the single most expensive record or a simple code rewrite, can achieve up to 16X performance improvement.
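The idea of propagating record-level latency along provenance can be illustrated with a toy sketch. This is not PerfDebug's implementation: the function names (`attribute_latency`, `top_contributors`), the two-stage map and group-by pipeline, the caller-supplied cost functions standing in for measured time, and the equal split of a group's cost among its members are all assumptions made for illustration.

```python
from collections import defaultdict


def attribute_latency(records, key_fn, map_fn, map_cost, reduce_cost):
    """Toy two-stage pipeline (map, then group-by-key aggregate).

    Returns {input record index: estimated latency contribution}.
    map_cost and reduce_cost are hypothetical stand-ins for measured time.
    """
    latency = {}
    groups = defaultdict(list)
    # Stage 1: map each record; keep its index as provenance and charge its cost.
    for rid, rec in enumerate(records):
        latency[rid] = map_cost(rec)
        groups[key_fn(rec)].append((rid, map_fn(rec)))
    # Stage 2: aggregate per key; split the group's cost equally among the
    # input records that fed it (an assumed attribution rule).
    for members in groups.values():
        share = reduce_cost([out for _, out in members]) / len(members)
        for rid, _ in members:
            latency[rid] += share
    return latency


def top_contributors(records, latency, n=1):
    """Return the n input records with the largest attributed latency."""
    ranked = sorted(latency.items(), key=lambda kv: kv[1], reverse=True)
    return [(records[rid], cost) for rid, cost in ranked[:n]]


# One abnormally long record dominates both stages.
records = ["ab", "c", "x" * 1000, "de"]
latency = attribute_latency(
    records,
    key_fn=lambda r: r[0] < "m",  # hypothetical grouping key
    map_fn=len,
    map_cost=len,                 # pretend cost grows with record length
    reduce_cost=sum,              # pretend cost grows with total group size
)
worst = top_contributors(records, latency)
```

Here `worst` holds the 1000-character record, which mirrors the abstract's final step of presenting the most expensive input records to the user. A real tool would use measured metrics rather than synthetic cost functions.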

Award ID(s): 1764077; 1763172

PAR ID: 10173701

Author(s) / Creator(s): Teoh, Jason; Gulzar, Muhammad Ali; Xu, Guoqing Harry; Kim, Miryung

Date Published: 2019-11-20

Journal Name: SoCC '19: Proceedings of the ACM Symposium on Cloud Computing

Page Range / eLocation ID: 465 to 476

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1145/3357223.3362727
