skip to main content


Title: Workload Shaping Energy Optimizations with Predictable Performance for Mobile Sensing
Energy-efficiency is a key concern in mobile sensing applications, such as those for tracking social interactions or physical activities. An attractive approach to saving energy is to shape the workload of the system by artificially introducing delays so that the workload would require less energy to process. However, adding delays to save energy may have a detrimental impact on user experience. To address this problem, we present Gratis, a novel paradigm for incorporating workload shaping energy optimizations in mobile sensing applications in an automated manner. Gratis adopts stream programs as a high-level abstraction whose execution is coordinated using an explicit power management policy. We present an expressive coordination language that can specify a broad range of workload-shaping optimizations. A unique property of the proposed power management policies is that they have predictable performance, which can be estimated at compile time, in a computationally efficient manner, from a small number of measurements. We have developed a simulator that can predict the energy with a average error of 7% and delay with a average error of 15%, even when applications have variable workloads. The simulator is scalable: hours of real-world traces can be simulated in a few seconds. Building on the simulator's accuracy and scalability, we have developed tools for configuring power management policies automatically. We have evaluated Gratis by developing two mobile applications and optimizing their energy consumption. For example, an application that tracks social interactions using speaker-identification techniques can run for only 7 hours without energy optimizations. However, when Gratis employs batching, scheduled concurrency, and adaptive sensing, the battery lifetime can be extended to 45 hours when the end-to-end deadline is one minute. These results demonstrate the efficacy of our approach to reduce energy consumption in mobile sensing applications.  more » « less
Award ID(s):
1750155
NSF-PAR ID:
10094533
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
2018 IEEE/ACM Third International Conference on Internet-of-Things Design and Implementation (IoTDI)
Page Range / eLocation ID:
177 to 188
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Mobile vision systems would benefit from the ability to situationally sacrifice image resolution to save system energy when imaging detail is unnecessary. Unfortunately, any change in sensor resolution leads to a substantial pause in frame delivery -- as much as 280 ms. Frame delivery is bottlenecked by a sequence of reconfiguration procedures and memory management in current operating systems before it resumes at the new resolution. This latency from reconfiguration impedes the adoption of otherwise beneficial resolution-energy tradeoff mechanisms. We propose Banner as a media framework that provides a rapid sensor resolution reconfiguration service as a modification to common media frameworks, e.g., V4L2. Banner completely eliminates the frame-to-frame reconfiguration latency (226 ms to 33 ms), i.e., removing the frame drop during sensor resolution reconfiguration. Banner also halves the end-to-end resolution reconfiguration latency (226 ms to 105 ms). This enables a more than 49% reduction of system power consumption by allowing continuous vision applications to reconfigure the sensor resolution to 480p compared with downsampling from 1080p to 480p, as measured in a cloud-based offloading workload running on a Jetson TX2 board. As a result, Banner unlocks unprecedented capabilities for mobile vision applications to dynamically reconfigure sensor resolutions to balance the energy efficiency and task accuracy tradeoff. 
    more » « less
  2. Energy-efficient visual sensing is of paramount importance to enable battery-backed low power IoT and mobile applications. Unfortunately, modern image sensors still consume hundreds of milliwatts of power, mainly due to analog readout. This is because current systems always supply a fixed voltage to the sensor’s analog circuitry, leading to higher power profiles. In this work, we propose to aggressively scale the analog voltage supplied to the camera as a means to significantly reduce sensor power consumption. To that end, we characterize the power and fidelity implications of analog voltage scaling on three off-the-shelf image sensors. Our characterization reveals that analog voltage scaling reduces sensor power but also degrades image quality. Furthermore, the degradation in image quality situationally affects the task accuracy of vision applications. We develop a visual streaming pipeline that flexibly allows application developers to dynamically adapt sensor voltage on a frame-by-frame basis. We develop a voltage controller that programmatically generates desired sensor voltage based on application request. We integrate our voltage controller into the existing RPi-based video streaming IoT pipeline. On top of this, we develop runtime support for flexible voltage specification from vision applications. Evaluating the system over a wide range of voltage scaling policies on popular vision tasks reveals that Squint imaging can deliver up to 73% sensor power savings, while maintaining reasonable task fidelity. Our artifacts are available at: https://gitlab.com/squint1/squint-ae-public 
    more » « less
  3. High-resolution simulations can deliver great visual quality, but they are often limited by available memory, especially on GPUs. We present a compiler for physical simulation that can achieve both high performance and significantly reduced memory costs, by enabling flexible and aggressive quantization. Low-precision ("quantized") numerical data types are used and packed to represent simulation states, leading to reduced memory space and bandwidth consumption. Quantized simulation allows higher resolution simulation with less memory, which is especially attractive on GPUs. Implementing a quantized simulator that has high performance and packs the data tightly for aggressive storage reduction would be extremely labor-intensive and error-prone using a traditional programming language. To make the creation of quantized simulation practical, we have developed a new set of language abstractions and a compilation system. A suite of tailored domain-specific optimizations ensure quantized simulators often run as fast as the full-precision simulators, despite the overhead of encoding-decoding the packed quantized data types. Our programming language and compiler, based on Taichi , allow developers to effortlessly switch between different full-precision and quantized simulators, to explore the full design space of quantization schemes, and ultimately to achieve a good balance between space and precision. The creation of quantized simulation with our system has large benefits in terms of memory consumption and performance, on a variety of hardware, from mobile devices to workstations with high-end GPUs. We can simulate with levels of resolution that were previously only achievable on systems with much more memory, such as multiple GPUs. For example, on a single GPU, we can simulate a Game of Life with 20 billion cells (8× compression per pixel), an Eulerian fluid system with 421 million active voxels (1.6× compression per voxel), and a hybrid Eulerian-Lagrangian elastic object simulation with 235 million particles (1.7× compression per particle). At the same time, quantized simulations create physically plausible results. Our quantization techniques are complementary to existing acceleration approaches of physical simulation: they can be used in combination with these existing approaches, such as sparse data structures, for even higher scalability and performance. 
    more » « less
  4. Utilization of Internet in everyday life has made us vulnerable in terms of security and privacy of our data and systems. For example, large-scale data breaches have occurred at Yahoo and Equifax because of lacking of robust and secure data protection within systems. Therefore, it is imperative to find solutions to further boost data security and protect privacy of our systems. To this end, we propose to authenticate users by utilizing score-level fusions based on mouse dynamics (e.g., mouse movement on a screen) and widget interactions (e.g., when clicking or hovering over different icons on a screen) on two novel datasets. In this study, we focus on two common applications, PayPal (a money transaction website) and Facebook (a social media platform). Though we fuse the same modalities for both applications, the purpose of investigating PayPal is to demonstrate how we can authenticate users when the users interact with the app for only a short period of time, while the purpose of investigating Facebook is to authenticate users based on social media browsing activities. We have a total of 10 users for PayPal with an average of 12 minutes of data per user and a total of 15 users for Facebook with an average of 2 hours of data per user. By fusing a single mouse trajectory with the associated widget interactions that occur during the trajectory, our mean EERs (Equal Error Rates) with a score-level fusion of mouse dynamics and widget interactions are 7.64% (SVM-rbf) and 3.25% (GBM), for PayPal, and 5.49% (SVM-rbf) and 2.54% (GBM), for Facebook. To further improve the performance of our fusion, we combine decision scores from multiple consecutive trajectories, which yields a 0% mean EER after 11 decision scores across all the users for both PayPal and Facebook. 
    more » « less
  5. Energy efficiency has emerged as a key concern for modern processor design, especially when it comes to embedded and mobile devices. It is vital to accurately quantify the power consumption of different micro-architectural components in a CPU. Traditional RTL or gate-level power estimation is too slow for early design-space exploration studies. By contrast, existing architecture-level power models suffer from large inaccuracies. Recently, advanced machine learning techniques have been proposed for accurate power modeling. However, existing approaches still require slow RTL simulations, have large training overheads or have only been demonstrated for fixed-function accelerators and simple in-order cores with predictable behavior. In this work, we present a novel machine learning-based approach for microarchitecture-level power modeling of complex CPUs. Our approach requires only high-level activity traces obtained from microarchitecture simulations. We extract representative features and develop low-complexity learning formulations for different types of CPU-internal structures. Cycle-accurate models at the sub-component level are trained from a small number of gate-level simulations and hierarchically composed to build power models for complete CPUs. We apply our approach to both in-order and out-of-order RISC-V cores. Cross-validation results show that our models predict cycle-by-cycle power consumption to within 3% of a gate-level power estimation on average. In addition, our power model for the Berkeley Out-of-Order (BOOM) core trained on micro-benchmarks can predict the cycle-by-cycle power of real-world applications with less than 3.6% mean absolute error. 
    more » « less