skip to main content

Title: DeepPM: Efficient Power Management in Edge Data Centers using Energy Storage
With the rapid development of the Internet of Things (IoT), computational workloads are gradually moving toward the internet edge for low latency. Due to significant workload fluctuations, edge data centers built in distributed locations suffer from resource underutilization and requires capacity underprovisioning to avoid wasting capital investment. The workload fluctuations, however, also make edge data centers more suitable for battery-assisted power management to counter the performance impact due to underprovisioning. In particular, the workload fluctuations allow the battery to be frequently recharged and made available for temporary capacity boosts. But, using batteries can overload the data center cooling system which is designed with a matching capacity of the power system. In this paper, we design a novel power management solution, DeepPM, that exploits the UPS battery and cold air inside the edge data center as energy storage to boost the performance. DeepPM uses deep reinforcement learning (DRL) to learn the data center thermal behavior online in a model-free manner and uses it on-the-fly to determine power allocation for optimum latency performance without overheating the data center. Our evaluation shows that DeepPM can improve latency performance by more than 50% compared to a power capping baseline while the server inlet temperature remains within safe operating limits (e.g., 32°C).  more » « less
Award ID(s):
1910208 1610471 1565474
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2020 IEEE 13th International Conference on Cloud Computing (CLOUD)
Page Range / eLocation ID:
370 to 379
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. An emerging trend in Internet of Things (IoT) applications is to move the computation (cyber) closer to the source of the data (physical). This paradigm is often referred to as edge computing. If edge resources are pooled together they can be used as decentralized shared resources for IoT applications, providing increased capacity to scale up computations and minimize end-to-end latency. Managing applications on these edge resources is hard, however, due to their remote, distributed, and (possibly) dynamic nature, which necessitates autonomous management mechanisms that facilitate application deployment, failure avoidance, failure management, and incremental updates. To address these needs, we present CHARIOT, which is orchestration middleware capable of autonomously managing IoT systems consisting of edge resources and applications. CHARIOT implements a three-layer architecture. The topmost layer comprises a system description language, the middle layer comprises a persistent data storage layer and the corresponding schema to store system information, and the bottom layer comprises a management engine that uses information stored persistently to formulate constraints that encode system properties and requirements, thereby enabling the use of Satisfiability Modulo Theories (SMT) solvers to compute optimal system (re)configurations dynamically at runtime. This paper describes the structure and functionality of CHARIOT and evaluates its efficacy as the basis for a smart parking system case study that uses sensors to manage parking spaces 
    more » « less
  2. Data center operators generally overprovision IT and cooling capacities to address unexpected utilization increases that can violate service quality commitments. This results in energy wastage. To reduce this wastage, we introduce HCP (Holistic Capacity Provisioner), a service latency aware management system for dynamically provisioning the server and cooling capacity. Short-term load prediction is used to adjust the online server capacity to concentrate the workload onto the smallest possible set of online servers. Idling servers are completely turned off based on a separate long-term utilization predictor. HCP targets data centers that use chilled air cooling and varies the cooling provided commensurately, using adjustable aperture tiles and speed control of the blower fans in the air handler. An HCP prototype supporting a server heterogeneity is evaluated with real-world workload traces/requests and realizes up to 32% total energy savings while limiting the 99th-percentile and average latency increases to at most 6.67% and 3.24%, respectively, against a baseline system where all servers are kept online. 
    more » « less
  3. The processing demands of current and emerging applications, such as image/video processing, are increasing due to the deluge of data, generated by mobile and edge devices. This raises challenges for a vast range of computing systems, starting from smart-phones and reaching cloud and data centers. Heterogeneous computing demonstrates its ability as an efficient computing model due to its capability to adapt to various workload requirements. Field programmable gate arrays (FPGAs) provide power and performance benefits and have been used in many application domains from embedded systems to the cloud. In this paper, we used a closely coupled CPU-FPGA heterogeneous system to accelerate a sliding window based image processing algorithm, Canny edge detector. We accelerated Canny using two different implementations: Code partitioned and data partitioned. In the data partitioned implementation, we proposed a weighted round-robin based algorithm that partitions input images and distributes the load between the CPU and the FPGA based on latency. The paper also compares the performance of the proposed accelerators with separate CPU and FPGA implementations. Using our hybrid CPU-FPGA based algorithm, we achieved a speedup of up to 4.8× over a CPU-only and up to 2.1× over a FPGA-only implementations. Moreover, the estimated total energy consumption of our algorithm is more efficient than a CPU-only implementation. Our results show a significant reduction in energy-delay product (EDP) compared to the CPU-only implementation, and comparable EDP results to the FPGA-only implementation. 
    more » « less
  4. Adoption of renewable energy in power grids introduces stability challenges in regulating the operation frequency of the electricity grid. Thus, electrical grid operators call for provisioning of frequency regulation services from end-user customers, such as data centers, to help balance the power grid’s stability by dynamically adjusting their energy consumption based on the power grid’s need. As renewable energy adoption grows, the average reward price of frequency regulation services has become much higher than that of the electricity cost. Therefore, there is a great cost incentive for data centers to provide frequency regulation service. Many existing techniques modulating data center power result in significant performance slowdown or provide a low amount of frequency regulation provision. We present PowerMorph , a tight QoS-aware data center power-reshaping framework, which enables commodity servers to provide practical frequency regulation service. The key behind PowerMorph  is using “complementary workload” as an additional knob to modulate server power, which provides high provision capacity while satisfying tight QoS constraints of latency-critical workloads. We achieve up to 58% improvement to TCO under common conditions, and in certain cases can even completely eliminate the data center electricity bill and provide a net profit. 
    more » « less
  5. The Internet of Things (IoT) requires distributed, large scale data collection via geographically distributed devices. While IoT devices typically send data to the cloud for processing, this is problematic for bandwidth constrained applications. Fog and edge computing (processing data near where it is gathered, and sending only results to the cloud) has become more popular, as it lowers network overhead and latency. Edge computing often uses devices with low computational capacity, therefore service frameworks and middleware are needed to efficiently compose services. While many frameworks use a top-down perspective, quality of service is an emergent property of the entire system and often requires a bottom up approach. We define services as multi-modal, allowing resource and performance tradeoffs. Different modes can be composed to meet an application's high level goal, which is modeled as a function. We examine a case study for counting vehicle traffic through intersections in Nashville. We apply object detection and tracking to video of the intersection, which must be performed at the edge due to privacy and bandwidth constraints. We explore the hardware and software architectures, and identify the various modes. This paper lays the foundation to formulate the online optimization problem presented by the system which makes tradeoffs between the quantity of services and their quality constrained by available resources. 
    more » « less