Fill-in the gaps: Spatial-temporal models for missing data

Xue, Ji; Nie, Bin; Smirni, Evgenia

doi:10.23919/CNSM.2017.8255983

Citation Details

Effective workload characterization and prediction are instrumental for efficiently and proactively managing large systems. System management primarily relies on the workload information provided by underlying system tracing mechanisms that record system-related events in log files. However, such tracing mechanisms may temporarily fail for various reasons, leaving “holes” in the data traces. This missing-data phenomenon significantly impedes the effectiveness of data analysis. In this paper, we study real-world data traces collected from over 80K virtual machines (VMs) hosted on 6K physical boxes in the data centers of a service provider. We discover that the usage series of VMs co-located on the same physical box are strongly correlated with one another, and that most VM usage series show temporal patterns. By taking advantage of these spatial and temporal dependencies, we propose a data-filling method to predict the missing data in the VM usage series. Detailed evaluation using trace data in the wild shows that the proposed method is sufficiently accurate, achieving an average absolute percentage error of 20%. We also illustrate its usefulness via a use case.

Award ID(s): 1649087

PAR ID: 10065572

Author(s) / Creator(s): Xue, Ji; Nie, Bin; Smirni, Evgenia

Date Published: 2017-11-01

Journal Name: 13th International Conference on Network and Service Management, CNSM 2017

Page Range / eLocation ID: 1 to 9

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.23919/CNSM.2017.8255983
