Query Log Compression for Workload Analytics

Xie, Ting; Chandola, Varun; Kennedy, Oliver

doi:https://doi.org/10.14778/3291264.3291265

Analyzing database access logs is a key part of performance tuning, intrusion detection, benchmark development, and many other database administration tasks. Unfortunately, production databases commonly handle millions of queries or more each day, so these logs must be summarized before they can be used. Designing an appropriate summary encoding requires trading off conciseness against information content. For example, simple workload sampling may miss rare but high-impact queries. In this paper, we present LogR, a lossy log compression scheme suitable for use in many automated log analytics tools, as well as for human inspection. We formalize and analyze the space/fidelity trade-off in the context of a broader family of “pattern” and “pattern mixture” log encodings to which LogR belongs. We show through a series of experiments that LogR compressed encodings can be created efficiently, come with provable information-theoretic bounds on their accuracy, and outperform state-of-the-art log summarization strategies.

Award ID(s): 1750460, 1409551

PAR ID: 10084497

Author(s) / Creator(s): Xie, Ting; Chandola, Varun; Kennedy, Oliver

Date Published: 2018-11-01

Journal Name: Proceedings of the VLDB Endowment

Volume: 12

Issue: 3

ISSN: 2150-8097

Page Range / eLocation ID: 183-196

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.14778/3291264.3291265
