Title: Zygarde: Time-Sensitive On-Device Deep Inference and Adaptation on Intermittently-Powered Systems
We propose Zygarde, an energy- and accuracy-aware soft real-time task scheduling framework for batteryless systems that flexibly execute deep learning tasks suitable for running on microcontrollers. The sporadic nature of harvested energy, the resource constraints of the embedded platform, and the computational demand of deep neural networks (DNNs) pose a unique and challenging real-time scheduling problem for which no solutions have been proposed in the literature. We empirically study the problem and model the energy harvesting pattern as well as the trade-off between the accuracy and execution of a DNN. We develop an imprecise computing-based scheduling algorithm that improves the timeliness of DNN tasks on intermittently powered systems. We evaluate Zygarde using four standard datasets as well as by deploying it in six real-life applications involving audio and camera sensor systems. Results show that Zygarde decreases the execution time by up to 26% and schedules 9% to 34% more tasks with up to 21% higher inference accuracy, compared to traditional schedulers such as earliest deadline first (EDF).
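As a rough illustration of the two ideas the abstract names, EDF ordering and imprecise computing, the sketch below splits each job into a mandatory part and an optional refinement part and fills unit time slots in earliest-deadline order, mandatory work first. This is not Zygarde's algorithm: the `Job` fields, the `schedule` function, the unit-slot time model, and the mandatory-first rule are all assumptions made for this example, and energy availability is not modeled at all.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    deadline: int            # absolute deadline, in unit time slots
    mandatory: int           # slots needed for a minimally acceptable result
    optional: int            # extra slots that would refine the result
    done_mandatory: int = 0
    done_optional: int = 0


def schedule(jobs, horizon):
    """Hypothetical mandatory-first EDF over unit slots (all jobs released at t=0).

    Slot t covers the interval [t, t+1), so a job may only run in slots
    with t < deadline.
    """
    for t in range(horizon):
        live = [j for j in jobs if j.deadline > t]

        # First priority: unfinished mandatory work, earliest deadline first.
        need = [j for j in live if j.done_mandatory < j.mandatory]
        if need:
            min(need, key=lambda j: j.deadline).done_mandatory += 1
            continue

        # Otherwise spend the slot on optional refinement, again in EDF order.
        extra = [j for j in live if j.done_optional < j.optional]
        if extra:
            min(extra, key=lambda j: j.deadline).done_optional += 1

    # Per job: (mandatory part finished in time?, optional slots completed)
    return {j.name: (j.done_mandatory == j.mandatory, j.done_optional) for j in jobs}


if __name__ == "__main__":
    jobs = [Job("A", deadline=3, mandatory=2, optional=2),
            Job("B", deadline=5, mandatory=2, optional=1)]
    print(schedule(jobs, horizon=6))
```

In this toy run both jobs finish their mandatory parts, but only B receives an optional slot, because A's deadline has passed by the time the processor is free. An energy-aware scheduler would additionally have to account for slots in which no harvested energy is available.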
Award ID(s):
1840131
PAR ID:
10301950
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies
Volume:
4
Issue:
3
ISSN:
2474-9567
Page Range / eLocation ID:
1 to 29
Sponsoring Org:
National Science Foundation
More Like this
  1. The concurrent execution of deep neural network (DNN) inference tasks on intermittently-powered batteryless devices (IPDs) has recently garnered much attention due to its potential in a broad range of smart sensing applications. While the checkpointing mechanisms (CMs) provided by the state-of-the-art make this possible, scheduling inference tasks on IPDs is still a complex problem due to significant performance variations across DNN layers and CM choices. This complexity is further accentuated by dynamic environmental conditions and inherent resource constraints of IPDs. To tackle these challenges, we present MII, a framework designed for intermittence-aware inference and scheduling on IPDs. MII formulates the shutdown and live time functions of an IPD from profiling data, which our offline intermittence-aware search scheme uses to find optimal layer-wise CMs for each task. At runtime, MII enhances job success rates by dynamically making scheduling decisions to mitigate workload losses from power interruptions and adjusting these CMs in response to actual energy patterns. Our evaluation demonstrates the superiority of MII over the state-of-the-art. In controlled environments, MII achieves an average increase of 21% and 39% in successful jobs under stable and dynamic energy patterns. In real-world settings, MII achieves 33% and 24% more successful jobs indoors and outdoors.
  2. Both energy efficiency and real-time performance are critical requirements in many embedded systems applications such as self-driving cars, robotic systems, disaster response, and security/safety control. These systems entail a myriad of real-time tasks, where each task itself is a parallel task that can utilize multiple computing units at the same time. Driven by the increasing demand for parallel tasks, multi-core embedded processors are inevitably evolving to many-core. Existing work on real-time parallel tasks has mostly focused on real-time scheduling without addressing energy consumption. In this paper, we address hard real-time scheduling of parallel tasks while minimizing their CPU energy consumption on multicore embedded systems. Each task is represented as a directed acyclic graph (DAG) with nodes indicating different threads of execution and edges indicating their dependencies. Our technique is to determine the execution speeds of the nodes of the DAGs to minimize the overall energy consumption while meeting all task deadlines. It incorporates a frequency optimization engine and the dynamic voltage and frequency scaling (DVFS) scheme into the classical real-time scheduling policies (both federated and global) and makes them energy-aware. The contributions of this paper thus include the first energy-aware online federated scheduling and also the first energy-aware global scheduling of DAGs. Evaluation using synthetic workload through simulation shows that our energy-aware real-time scheduling policies can achieve up to 68% energy saving compared to classical (energy-unaware) policies. We have also performed a proof-of-concept system evaluation using physical hardware, demonstrating the energy efficiency of our proposed approach.
  3. Pellizzoni, Rodolfo (Ed.)
    Deep Neural Networks (DNNs) are becoming common in "learning-enabled" time-critical applications such as autonomous driving and robotics. One approach to protect DNN inference from adversarial actions and preserve model privacy/confidentiality is to execute them within trusted enclaves available in modern processors. However, running DNN inference inside limited-capacity enclaves while ensuring timing guarantees is challenging due to (a) large size of DNN workloads and (b) extra switching between "normal" and "trusted" execution modes. This paper introduces new time-aware scheduling schemes - DeepTrust^RT - to securely execute deep neural inferences for learning-enabled real-time systems. We first propose a variant of EDF (called DeepTrust^RT-LW) that slices each DNN layer and runs them sequentially in the enclave. However, due to extra context switch overheads of individual layer slices, we further introduce a novel layer fusion technique (named DeepTrust^RT-FUSION). Our proposed scheme provides hard real-time guarantees by fusing multiple layers of DNN workload from multiple tasks; thus allowing them to fit and run concurrently within the enclaves while maintaining real-time guarantees. We implemented and tested DeepTrust^RT ideas on the Raspberry Pi platform running OP-TEE+DarkNet-TZ DNN APIs and three DNN workloads (AlexNet-squeezed, Tiny Darknet, YOLOv3-tiny). Compared to the layer-wise partitioning approach (DeepTrust^RT-LW), DeepTrust^RT-FUSION can schedule up to 3x more tasksets and reduce context switches by up to 11.12x. We further demonstrate the efficacy of DeepTrust^RT using a flight controller (ArduPilot) case study and find that DeepTrust^RT-FUSION retains real-time guarantees where DeepTrust^RT-LW becomes unschedulable. 
  4. As more apps embrace AI, it is becoming increasingly common that multiple Deep Neural Networks (DNN)-powered apps may run at the same time on a mobile device. This paper explores scheduling in such multi-instance DNN scenarios, on general open mobile systems (e.g., common smartphones and tablets). Unlike closed systems (e.g., autonomous driving systems) where the set of co-run apps is known beforehand, the user of an open mobile system may install or uninstall arbitrary apps at any time, and a centralized solution is subject to adoption barriers. This work proposes the first-known decentralized application-level scheduling mechanism to address the problem. By leveraging the adaptivity of Deep Reinforcement Learning, the solution is shown to make the scheduling of co-run apps converge to a Nash equilibrium point, yielding a good balance of gains among the apps. The solution moreover automatically adapts to the running environment and the underlying OS and hardware. Experiments show that the solution consistently produces significant speedups and energy savings across DNN workloads, hardware configurations, and running scenarios. 
  5. Deep neural networks (DNNs) are increasingly used in time-critical, learning-enabled cyber-physical applications such as autonomous driving and robotics. Despite the growing use of various deep learning models, protecting DNN inference from adversarial threats while preserving model privacy and confidentiality remains a key concern for resource and timing-constrained autonomous cyber-physical systems. One potential solution, primarily used in general-purpose systems, is the execution of the DNN workloads within trusted enclaves available on current off-the-shelf processors. However, ensuring temporal guarantees when running DNN inference within these enclaves poses significant challenges in real-time applications due to (a) the large computational and memory demands of DNN models and (b) the overhead introduced by frequent context switches between “normal” and “trusted” execution modes. This paper introduces new time-aware schemes for dynamic (EDF) and fixed-priority (RM) schedulers to preserve the confidentiality of DNN tasks by running them inside trusted enclaves. We first propose a technique that slices each DNN layer and runs them sequentially in the enclave. However, due to the extra context switch overheads of individual layer slices, we further introduce a novel layer fusion technique. Layer fusion improves real-time guarantees by grouping multiple layers of DNN workload from multiple tasks, thus allowing them to fit and run concurrently within the enclaves while maintaining timing constraints. We implemented and tested our ideas on the Raspberry Pi platform running a DNN-enabled trusted operating system (OP-TEE with DarkNet-TZ) and three DNN architectures (AlexNet-squeezed, Tiny Darknet, YOLOv3-tiny). Compared to the layer-wise partitioning approach, layer fusion can (a) schedule up to 3x more tasksets for EDF and 5x for RM and (b) reduce context switches by up to 11.12x for EDF and by up to 11.06x for RM.