ReTail: Opting for Learning Simplicity to Enable QoS-Aware Power Management in the Cloud

Shuang Chen, Angela Jin

Many cloud services have Quality-of-Service (QoS) requirements; most requests have to to complete within a given latency constraint. Recently, researchers have begun to investigate whether it is possible to meet QoS while attempting to save power on a per-request basis. Existing work shows that one can indeed hand-tune a request latency predictor offline for a particular cloud application, and consult it at runtime to modulate CPU voltage and frequency, resulting in substantial power savings. In this paper, we propose ReTail, an automated and general solution for request-level power management of latency-critical services with QoS constraints. We present a systematic process to select the features of any given application that best correlate with its request latency. ReTail uses these features to predict latency, and adjust CPU’s power consumption. ReTail’s predictor is trained fully at runtime. We show that unlike previous findings, simple techniques perform better than complex machine learning models, when using the right input features. For a web search engine, ReTail outperforms prior mechanisms based on complex hand-tuned predictors for that application domain. Furthermore, ReTail’s systematic approach also yields superior power savings across a diverse set of cloud applications.

More Like this