Power management in data centers is challenging because of fluctuating workloads and strict task completion time requirements. Recent resource provisioning systems, such as Borg and RC-Informed, pack tasks on servers to save power. However, current power optimization frameworks based on packing leave very little headroom for spikes, and task completion times are compromised. In this paper, we design Goldilocks, a novel resource provisioning system for optimizing both power and task completion time by allocating tasks to servers in groups. Tasks hosted in containers are grouped together by running a graph partitioning algorithm. Containers that communicate frequently are placed together, which improves task completion times. We also leverage new findings on the power consumption of modern-day servers to ensure that their utilizations stay in a range where they are power-proportional. Both testbed implementation measurements and large-scale trace-driven simulations show that Goldilocks outperforms all previous works on data center power saving. Goldilocks reduces power consumption by 11.7%-26.2% depending on the workload, whereas the best of the implemented alternatives, Borg, saves 8.9%-22.8%. The energy per request for the Twitter content caching workload in Goldilocks is only 33% of that in RC-Informed. Finally, the best alternative in terms of task completion time, E-PVM, has 1.17-3.29 times higher task completion times than Goldilocks across different workloads.
A New Approach to Capacity Scaling Augmented with Unreliable Machine Learning Predictions
Modern data centers suffer from immense power consumption. As a result, data center operators have heavily invested in capacity-scaling solutions, which dynamically deactivate servers if the demand is low and activate them again when the workload increases. We analyze a continuous-time model for capacity scaling, where the goal is to minimize the weighted sum of flow time, switching cost, and power consumption in an online fashion. We propose a novel algorithm, called adaptive balanced capacity scaling (ABCS), that has access to black-box machine learning predictions. ABCS aims to adapt to the predictions and is also robust against unpredictable surges in the workload. In particular, we prove that ABCS is [Formula: see text] competitive if the predictions are accurate, and yet, it has a uniformly bounded competitive ratio even if the predictions are completely inaccurate. Finally, we investigate the performance of this algorithm on a real-world data set and carry out extensive numerical experiments, which positively support the theoretical results. Funding: This work was partially supported by the Division of Computing and Communication Foundations [Grant 2113027]. The authors also acknowledge financial support for this project from the Algorithm and Randomness Center–Transdisciplinary Research Institute for Advancing Data Science Fellowship at Georgia Tech.
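The objective described in the abstract, a weighted sum of flow time, switching cost, and power consumption, can be illustrated with a small discrete-time sketch. This is not the paper's continuous-time model and not the ABCS algorithm; it only evaluates the cost of a given server schedule. The names `Weights` and `capacity_scaling_cost`, the unit service rate per server, and the use of accumulated backlog as a stand-in for flow time are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Weights:
    """Hypothetical weights for the three cost terms (not from the paper)."""
    flow: float = 1.0    # weight on accumulated backlog (proxy for flow time)
    switch: float = 1.0  # cost charged for each server that is activated
    power: float = 1.0   # cost per active server per unit of time


def capacity_scaling_cost(arrivals, servers, weights, dt=1.0):
    """Evaluate the weighted cost of a server schedule in discrete time.

    arrivals[t] is the work arriving in step t and servers[t] is the number
    of active servers in step t. Assumption: each active server processes
    one unit of work per unit of time, and all servers start switched off.
    """
    if len(arrivals) != len(servers):
        raise ValueError("arrivals and servers must have the same length")

    backlog = 0.0
    previous = 0
    flow = switch = power = 0.0
    for work, active in zip(arrivals, servers):
        # Unfinished work carries over to the next step.
        backlog = max(0.0, backlog + work - active * dt)
        flow += backlog * dt                # backlog integrated over time
        switch += max(0, active - previous)  # only activations are charged
        power += active * dt
        previous = active

    return weights.flow * flow + weights.switch * switch + weights.power * power


if __name__ == "__main__":
    demand = [2, 0, 3]
    print(capacity_scaling_cost(demand, [1, 1, 2], Weights()))  # lean schedule
    print(capacity_scaling_cost(demand, [3, 3, 3], Weights()))  # always-on schedule
```

Comparing a lean schedule against an always-on one under different weights shows the trade-off an online algorithm has to manage: fewer servers lower the power term but let backlog build up, while frequent scaling raises the switching term.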
- Award ID(s): 2113027
- PAR ID: 10421615
- Date Published:
- Journal Name: Mathematics of Operations Research
- ISSN: 0364-765X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Data centers have complex environments that undergo constant changes due to fluctuations in IT load, commissioning and decommissioning of IT equipment, heterogeneous rack architectures, and varying environmental conditions. These dynamic factors often pose challenges in effectively provisioning cooling systems, resulting in higher energy consumption. To address this issue, it is crucial to consider data center thermal heterogeneity when allocating workloads and controlling cooling, as it can impact operational efficiency. Computational Fluid Dynamics (CFD) models are used to simulate data center heterogeneity and analyze the impact of two different cooling mechanisms on operational efficiency. This research compares facility-water-based cooling for Rear Door Heat Exchanger (RDHx) and conventional Computer Room Air Handler (CRAH) systems in two different data center configurations. Efficiency is measured in terms of ΔT across the facility water; a higher ΔT results in more efficient chiller operation. The actual chiller efficiency is not calculated, as it would depend on the local ambient conditions in which the chiller is operated. The first data center model represents a typical enterprise-level configuration where all servers and racks have homogeneous IT power. The second model represents a colocation facility where server/rack power configurations are randomly distributed. These models predict temperature variations at different locations based on IT workload and cooling parameters. Traditionally, CRAH configurations are selected based on total IT power consumption, rack power density, and required cooling capacity for the entire data center space. On the other hand, RDHx can be scaled based on individual rack power density, offering localized cooling advantages. Multiple workload distribution scenarios were simulated for both CRAH and RDHx-based data center models.
The results showed that RDHx provides a uniform thermal profile across the data center, irrespective of server/rack power density or workload distribution. This characteristic reduces the risk of over- or under-provisioning racks when using RDHx. Operational efficiency is compared in terms of the difference between supply and return temperature of facility water for CRAH and RDHx units, based on spatial heat dissipation and workload distribution. RDHx demonstrated excellent cooling capabilities while maintaining a higher ΔT, resulting in reduced cooling energy consumption, operational carbon footprint, and water usage.
With the rapid development of the Internet of Things (IoT), computational workloads are gradually moving toward the internet edge for low latency. Due to significant workload fluctuations, edge data centers built in distributed locations suffer from resource underutilization and require capacity underprovisioning to avoid wasting capital investment. The workload fluctuations, however, also make edge data centers more suitable for battery-assisted power management to counter the performance impact of underprovisioning. In particular, the workload fluctuations allow the battery to be frequently recharged and made available for temporary capacity boosts. However, using batteries can overload the data center cooling system, which is designed with a capacity matching that of the power system. In this paper, we design a novel power management solution, DeepPM, that exploits the UPS battery and the cold air inside the edge data center as energy storage to boost performance. DeepPM uses deep reinforcement learning (DRL) to learn the data center thermal behavior online in a model-free manner and uses it on-the-fly to determine the power allocation for optimum latency performance without overheating the data center. Our evaluation shows that DeepPM can improve latency performance by more than 50% compared to a power capping baseline while the server inlet temperature remains within safe operating limits (e.g., 32°C).
Cloud platforms’ rapid growth raises significant concerns about their electricity consumption and resulting carbon emissions. Power capping is a known technique for limiting the power consumption of data centers where workloads are hosted. Today’s data center computer clusters co-locate latency-sensitive web and throughput-oriented batch workloads. When power capping is necessary, throttling only the batch tasks without restricting latency-sensitive web workloads is ideal, because low response time for latency-sensitive workloads must be guaranteed to meet Service-Level Objective (SLO) requirements. This paper proposes PADS, a hardware-agnostic, workload-aware power capping system. Because it does not rely on hardware mechanisms such as RAPL and DVFS, it can keep the power consumption of clusters equipped with heterogeneous architectures such as x86 and ARM below the enforced power limit while minimizing the impact on latency-sensitive tasks. It uses an application-performance model of both latency-sensitive and batch workloads to ensure power safety with controllable performance. Our power capping technique uses diagonal scaling and relies on the control group feature of the Linux kernel. Our results indicate that PADS is highly effective in reducing power while respecting the tail latency requirement of the latency-sensitive workload. Furthermore, compared to state-of-the-art solutions, PADS demonstrates lower P95 latency, accompanied by a 90% higher effectiveness in respecting power limits.
Adoption of renewable energy in power grids introduces stability challenges in regulating the operating frequency of the electricity grid. Thus, electrical grid operators call for the provisioning of frequency regulation services from end-user customers, such as data centers, to help maintain the power grid’s stability by dynamically adjusting their energy consumption based on the power grid’s needs. As renewable energy adoption grows, the average reward price of frequency regulation services has become much higher than the electricity cost. Therefore, there is a great cost incentive for data centers to provide frequency regulation service. Many existing techniques for modulating data center power result in significant performance slowdown or provide a low amount of frequency regulation provision. We present PowerMorph, a tight QoS-aware data center power-reshaping framework, which enables commodity servers to provide practical frequency regulation service. The key behind PowerMorph is using a “complementary workload” as an additional knob to modulate server power, which provides high provision capacity while satisfying the tight QoS constraints of latency-critical workloads. We achieve up to 58% improvement in TCO under common conditions, and in certain cases can even completely eliminate the data center electricity bill and provide a net profit.

