skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: EdgeWise: Energy-Efficient CNN Computation on Edge Devices under Stochastic Communication Delays
This paper presents a framework to enable the energy-efficient execution of convolutional neural networks (CNNs) on edge devices. The framework consists of a pair of edge devices connected via a wireless network: a performance and energy-constrained device D as the first recipient of data, and an energy-unconstrained device N as an accelerator for D. Device D decides on-the-fly how to distribute the workload with the objective of minimizing its energy consumption while accounting for the inherent uncertainty in network delay and the overheads involved in data transfer. These challenges are tackled by adopting the data-driven modeling framework of Markov Decision Processes (MDP), whereby an optimal policy is consulted by D in O(1) time to make layer-by-layer assignment decisions. As a special case, a linear-time dynamic programming algorithm is also presented for finding optimal layer assignment at once, under the assumption that the network delay is constant throughout the execution of the application. The proposed framework is demonstrated on a platform comprised of a Raspberry PI 3 as D and an NVIDIA Jetson TX2 as N. An average improvement of 31% and 23% in energy consumption is achieved compared to the alternatives of executing the CNNs entirely on D and N. Two state-of-the-art methods were also implemented, and compared with the proposed methods.  more » « less
Award ID(s):
1652132 2008244
PAR ID:
10339229
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
ACM Transactions on Embedded Computing Systems
ISSN:
1539-9087
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    With the recent advances in both machine learning and embedded systems research, the demand to deploy computational models for real-time execution on edge devices has increased substantially. Without deploying computational models on edge devices, the frequent transmission of sensor data to the cloud results in rapid battery draining due to the energy consumption of wireless data transmission. This rapid power dissipation leads to a considerable reduction in the battery lifetime of the system, therefore jeopardizing the real-world utility of smart devices. It is well-established that for difficult machine learning tasks, models with higher performance often require more computation power and thus are not power-efficient choices for deployment on edge devices. However, the trade-offs between performance and power consumption are not well studied. While numerous methods (e.g., model compression) have been developed to obtain an optimal model, these methods focus on improving the efficiency of a single model. In an entirely new direction, we introduce an effective method to find a combination of multiple models that are optimal in terms of power-efficiency and performance by solving an optimization problem in which both performance and power consumption are taken into account. Experimental results demonstrate that on the ImageNet dataset, we can achieve a 20% energy reduction with only 0.3% accuracy drop compared to Squeeze-and-Excitation Networks. Compared to a pruned convolutional neural network for human activity recognition, while consuming 1.7% less energy, our proposed policy achieves 1.3% higher accuracy. 
    more » « less
  2. This paper describes a novel framework for executing a network of trained deep neural network (DNN) models on commercial-off-the-shelf devices that are deployed in an IoT environment. The scenario consists of two devices connected by a wireless network: a user-end device (U), which is a low-end, energy and performance-limited processor, and a cloudlet (C), which is a substantially higher performance and energy-unconstrained processor. The goal is to distribute the computation of the DNN models between U and C to minimize the energy consumption of U while taking into account the variability in the wireless channel delay and the performance overhead of executing models in parallel. The proposed framework was implemented using an NVIDIA Jetson Nano for U and a Dell workstation with Titan Xp GPU as C. Experiments demonstrate significant improvements both in terms of energy consumption of U and processing delay. 
    more » « less
  3. Mobile devices such as smartphones and autonomous vehicles increasingly rely on deep neural networks (DNNs) to execute complex inference tasks such as image classification and speech recognition, among others. However, continuously executing the entire DNN on mobile devices can quickly deplete their battery. Although task offloading to cloud/edge servers may decrease the mobile device’s computational burden, erratic patterns in channel quality, network, and edge server load can lead to a significant delay in task execution. Recently, approaches based on split computing (SC) have been proposed, where the DNN is split into a head and a tail model, executed respectively on the mobile device and on the edge server. Ultimately, this may reduce bandwidth usage as well as energy consumption. Another approach, called early exiting (EE), trains models to embed multiple “exits” earlier in the architecture, each providing increasingly higher target accuracy. Therefore, the tradeoff between accuracy and delay can be tuned according to the current conditions or application demands. In this article, we provide a comprehensive survey of the state of the art in SC and EE strategies by presenting a comparison of the most relevant approaches. We conclude the article by providing a set of compelling research challenges. 
    more » « less
  4. Mobile edge computing (MEC) is an emerging paradigm that integrates computing resources in wireless access networks to process computational tasks in close proximity to mobile users with low latency. In this paper, we propose an online double deep Q networks (DDQN) based learning scheme for task assignment in dynamic MEC networks, which enables multiple distributed edge nodes and a cloud data center to jointly process user tasks to achieve optimal long-term quality of service (QoS). The proposed scheme captures a wide range of dynamic network parameters including non-stationary node computing capabilities, network delay statistics, and task arrivals. It learns the optimal task assignment policy with no assumption on the knowledge of the underlying dynamics. In addition, the proposed algorithm accounts for both performance and complexity, and addresses the state and action space explosion problem in conventional Q learning. The evaluation results show that the proposed DDQN-based task assignment scheme significantly improves the QoS performance, compared to the existing schemes that do not consider the effects of network dynamics on the expected long-term rewards, while scaling reasonably well as the network size increases. 
    more » « less
  5. Recent advances in Internet of Things (IoT) technologies have sparked significant interest toward developing learning-based sensing applications on embedded edge devices. These efforts, however, are being challenged by the complexities of adapting to unforeseen conditions in an open-world environment, mainly due to the intensive computational and energy demands exceeding the capabilities of edge devices. In this article, we propose OpenSense, an open-world time-series sensing framework for making inferences from time-series sensor data and achieving incremental learning on an embedded edge device with limited resources. The proposed framework is able to achieve two essential tasks, inference and incremental learning, eliminating the necessity for powerful cloud servers. In addition, to secure enough time for incremental learning and reduce energy consumption, we need to schedule sensing activities without missing any events in the environment. Therefore, we propose two dynamic sensor scheduling techniques: 1) a class-level period assignment scheduler that finds an appropriate sensing period for each inferred class and 2) a Q-learning-based scheduler that dynamically determines the sensing interval for each classification moment by learning the patterns of event classes. With this framework, we discuss the design choices made to ensure satisfactory learning performance and efficient resource usage. Experimental results demonstrate the ability of the system to incrementally adapt to unforeseen conditions and to efficiently schedule to run on a resource-constrained device. 
    more » « less