Summary
Energy‐efficient scientific applications require insight into how high performance computing system features impact the applications' power and performance. This insight can result from the development of performance and power models. In this article, we use the modeling and prediction tool MuMMI (Multiple Metrics Modeling Infrastructure) and 10 machine learning methods to model and predict performance and power consumption and compare their prediction error rates. We use an algorithm‐based fault‐tolerant linear algebra code and a multilevel checkpointing fault‐tolerant heat distribution code to conduct our modeling and prediction study on the Cray XC40 Theta and IBM BG/Q Mira at Argonne National Laboratory and the Intel Haswell cluster Shepard at Sandia National Laboratories. Our experimental results show that the prediction error rates in performance and power using MuMMI are less than 10% for most cases. By utilizing the models for runtime, node power, CPU power, and memory power, we identify the most significant performance counters for potential application optimizations, and we predict theoretical outcomes of the optimizations. Based on two collected datasets, we analyze and compare the prediction accuracy in performance and power consumption using MuMMI and 10 machine learning methods.
Workload Shaping Energy Optimizations with Predictable Performance for Mobile Sensing
Energy efficiency is a key concern in mobile sensing applications, such as those for tracking social interactions or physical activities. An attractive approach to saving energy is to shape the workload of the system by artificially introducing delays so that the workload requires less energy to process. However, adding delays to save energy may have a detrimental impact on user experience. To address this problem, we present Gratis, a novel paradigm for incorporating workload shaping energy optimizations in mobile sensing applications in an automated manner. Gratis adopts stream programs as a high-level abstraction whose execution is coordinated using an explicit power management policy. We present an expressive coordination language that can specify a broad range of workload-shaping optimizations. A unique property of the proposed power management policies is that they have predictable performance, which can be estimated at compile time, in a computationally efficient manner, from a small number of measurements. We have developed a simulator that can predict the energy with an average error of 7% and delay with an average error of 15%, even when applications have variable workloads. The simulator is scalable: hours of real-world traces can be simulated in a few seconds. Building on the simulator's accuracy and scalability, we have developed tools for configuring power management policies automatically. We have evaluated Gratis by developing two mobile applications and optimizing their energy consumption. For example, an application that tracks social interactions using speaker-identification techniques can run for only 7 hours without energy optimizations. However, when Gratis employs batching, scheduled concurrency, and adaptive sensing, the battery lifetime can be extended to 45 hours when the end-to-end deadline is one minute. These results demonstrate the efficacy of our approach to reducing energy consumption in mobile sensing applications.
- Award ID(s): 1750155
- PAR ID: 10094533
- Date Published:
- Journal Name: 2018 IEEE/ACM Third International Conference on Internet-of-Things Design and Implementation (IoTDI)
- Page Range / eLocation ID: 177 to 188
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
A large amount of data is produced by mobile devices today. The rising computational abilities and sophisticated operating systems (OS) on these devices have allowed us to create applications that are able to leverage this data to deliver better services. But today’s mobile technology is heavily limited by low battery capacity and limited cooling capabilities, which has motivated a search for new ways to optimize for energy efficiency. A challenge in conducting such optimizations for today’s mobile devices is making changes in complex OS and application software architectures. Middleware has become an increasingly popular solution for inserting energy-efficient solutions and optimizations in a robust manner, without altering the OS or application code. This is because of the flexibility and standardization that can be achieved through middleware. In this paper, we discuss some powerful and promising developments in prototyping middleware for energy-efficient and robust execution of a variety of applications on commodity mobile computing devices.
-
Mobile vision systems would benefit from the ability to situationally sacrifice image resolution to save system energy when imaging detail is unnecessary. Unfortunately, any change in sensor resolution leads to a substantial pause in frame delivery -- as much as 280 ms. Frame delivery is bottlenecked by a sequence of reconfiguration procedures and memory management in current operating systems before it resumes at the new resolution. This latency from reconfiguration impedes the adoption of otherwise beneficial resolution-energy tradeoff mechanisms. We propose Banner as a media framework that provides a rapid sensor resolution reconfiguration service as a modification to common media frameworks, e.g., V4L2. Banner completely eliminates the frame-to-frame reconfiguration latency (226 ms to 33 ms), i.e., removing the frame drop during sensor resolution reconfiguration. Banner also halves the end-to-end resolution reconfiguration latency (226 ms to 105 ms). This enables a more than 49% reduction of system power consumption by allowing continuous vision applications to reconfigure the sensor resolution to 480p compared with downsampling from 1080p to 480p, as measured in a cloud-based offloading workload running on a Jetson TX2 board. As a result, Banner unlocks unprecedented capabilities for mobile vision applications to dynamically reconfigure sensor resolutions to balance the energy efficiency and task accuracy tradeoff.
-
With the growing performance and wide application of deep neural networks (DNNs), recent years have seen enormous efforts on DNN accelerator hardware design for platforms from mobile devices to data centers. The systolic array has been a popular architectural choice for many proposed DNN accelerators with hundreds to thousands of processing elements (PEs) for parallel computing. Systolic array-based DNN accelerators for datacenter applications have high power consumption and nonuniform workload distribution, which makes power delivery network (PDN) design challenging. Server-class multicore processors have benefited from distributed on-chip voltage regulation and heterogeneous voltage regulation (HVR) for improving energy efficiency while guaranteeing power delivery integrity. This paper presents the first work on HVR-based PDN architecture and control for systolic array-based DNN accelerators. We propose to employ a PDN architecture comprising heterogeneous on-chip and off-chip voltage regulators and multiple power domains. By analyzing patterns of typical DNN workloads via a modeling framework, we propose a DNN workload-aware dynamic PDN control policy to maximize system energy efficiency while ensuring power integrity. We demonstrate significant energy efficiency improvements brought by the proposed PDN architecture, dynamic control, and power gating, which lead to a more than five-fold reduction of leakage energy and PDN energy overhead for systolic array DNN accelerators.
-
To promote energy-efficient operations in residential and office buildings, non-intrusive load monitoring (NILM) techniques have been proposed to infer the fine-grained power consumption and usage patterns of appliances from power-line measurement data. Fine-grained monitoring of everyday appliances (such as toasters and coffee makers) can not only promote energy-efficient building operations, but also provide unique insights into the context and activities of individuals. Current building-level NILM techniques are unable to identify the consumption characteristics of relatively low-load appliances, whereas smart-plug based solutions incur significant deployment and maintenance costs. In this paper, we investigate an intermediate architecture, where smart circuit breakers provide measurements of aggregate power consumption at room (or section) level granularity. We then investigate techniques to identify the usage and energy consumption of individual appliances from such measurements. We first develop a novel correlation-based approach called CBPA to identify individual appliances based on both their unique transient and steady-state power signatures. While promising, CBPA fails when the set of candidate appliances is too large. To further improve the accuracy of appliance level usage estimation, we then propose a hybrid system called AARPA, which uses mobile sensing to first infer high-level activities of daily living (ADLs), and then uses knowledge of such ADLs to effectively reduce the set of candidate appliances that potentially contribute to the aggregate readings at any point. We evaluate two variants of this algorithm, and show, using real-life data traces gathered from 10 domestic users, that our fusion of mobile and power-line sensing is very promising: it identified all devices that were used in each data trace, and it identified the usage duration and energy consumption of low-load consumer appliances with 87% accuracy.

