skip to main content

Title: Online Reconfiguration of Regularity-Based Resource Partitions in Cyber-Physical Systems
We consider the problem of resource provisioning for real-time cyber-physical applications in an open system environment where there does not exist a global resource scheduler that has complete knowledge of the real-time performance requirements of each individual application that shares the resources with the other applications. Regularity-based Resource Partition (RRP) model is an effective strategy to hierarchically partition and assign various resource slices among the applications. However, RRP model does not consider changes in resource requests from the applications at run time. To allow for the run time adaptation to change resource requirements, we consider in this paper the issues in online resource partition reconfiguration, including semantics issues that arise in configuration transitions that may cause application failures. Based on the reconfiguration semantics, we study the online resource reconfigurability problem under the RRP model where the availability factors of resource partitions may be reconfigured during run time. We formalize the Dynamic Partition Reconfiguration (DPR) problem and provide a solution to this problem. Extensive experiments have been conducted to evaluate the performance of the proposed approach in different scenarios. We also present a case study using the autonomous F1/10 model car; the controller of the F1/10 car requires resource adaptation to satisfy the computing needs of its PID controller and vision system under different operating conditions. Our implementation demonstrates the effectiveness and benefit of online resource partition reconfiguration using the DPR approach in a real system.  more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
IEEE Real-Time Systems Symposium
Page Range / eLocation ID:
495 to 507
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Cloud providers such as Amazon and Microsoft have begun to support on-demand FPGA acceleration in the cloud, and hardware vendors will support FPGAs in future processors. At the same time, technology advancements such as 3D stacking, through-silicon vias (TSVs), and FinFETs have greatly increased FPGA density. The massive parallelism of current FPGAs can support not only extremely large applications, but multiple applications simultaneously as well. System support for FPGAs, however, is in its infancy. Unlike software, where resource configurations are limited to simple dimensions of compute, memory, and I/O, FPGAs provide a multi-dimensional sea of resources known as the FPGA fabric: logic cells, floating point units, memories, and I/O can all be wired together, leading to spatial constraints on FPGA resources. Current stacks either support only a single application or statically partition the FPGA fabric into fixed-size slots. These designs cannot efficiently support diverse workloads: the size of the largest slot places an artificial limit on application size, and oversized slots result in wasted FPGA resources and reduced concurrency. This paper presents AMORPHOS, which encapsulates user FPGA logic in morphable tasks, or Morphlets. Morphlets provide isolation and protection across mutually distrustful protection domains, extending the guarantees of software processes. Morphlets can morph, dynamically altering their deployed form based on resource requirements and availability. To build Morphlets, developers provide a parameterized hardware design that interfaces with AMORPHOS, along with a mesh, which specifies external resource requirements. AMORPHOS explores the parameter space, generating deployable Morphlets of varying size and resource requirements. AMORPHOS multiplexes Morphlets on the FPGA in both space and time to maximize FPGA utilization. We implement AMORPHOS on Amazon F1 [1] and Microsoft Catapult [92]. We show that protected sharing and dynamic scalability support on workloads such as DNN inference and blockchain mining improves aggregate throughput up to 4× and 23× on Catapult and F1 respectively. 
    more » « less
  2. Mobile devices such as drones and autonomous vehicles increasingly rely on object detection (OD) through deep neural networks (DNNs) to perform critical tasks such as navigation, target-tracking and surveillance, just to name a few. Due to their high complexity, the execution of these DNNs requires excessive time and energy. Low-complexity object tracking (OT) is thus used along with OD, where the latter is periodically applied to generate "fresh" references for tracking. However, the frames processed with OD incur large delays, which does not comply with real-time applications requirements. Offloading OD to edge servers can mitigate this issue, but existing work focuses on the optimization of the offloading process in systems where the wireless channel has a very large capacity. Herein, we consider systems with constrained and erratic channel capacity, and establish parallel OT (at the mobile device) and OD (at the edge server) processes that are resilient to large OD latency. We propose Katch-Up, a novel tracking mechanism that improves the system resilience to excessive OD delay. We show that this technique greatly improves the quality of the reference available to tracking, and boosts performance up to 33%. However, while Katch-Up significantly improves performance, it also increases the computing load of the mobile device. Hence, we design SmartDet, a low-complexity controller based on deep reinforcement learning (DRL) that learns to achieve the right trade-off between resource utilization and OD performance. SmartDet takes as input highly-heterogeneous context-related information related to the current video content and the current network conditions to optimize frequency and type of OD offloading, as well as Katch-Up utilization. We extensively evaluate SmartDet on a real-world testbed composed by a JetSon Nano as mobile device and a GTX 980 Ti as edge server, connected through a Wi-Fi link, to collect several network-related traces, as well as energy measurements. We consider a state-of-the-art video dataset (ILSVRC 2015 - VID) and state-of-the-art OD models (EfficientDet 0, 2 and 4). Experimental results show that SmartDet achieves an optimal balance between tracking performance – mean Average Recall (mAR) and resource usage. With respect to a baseline with full Katch-Up usage and maximum channel usage, we still increase mAR by 4% while using 50% less of the channel and 30% power resources associated with Katch-Up. With respect to a fixed strategy using minimal resources, we increase mAR by 20% while using Katch-Up on 1/3 of the frames. 
    more » « less
  3. Cyber-Physical Systems (CPS) have been increasingly subject to cyber-attacks including code injection attacks. Zero day attacks further exasperate the threat landscape by requiring a shift to defense in depth approaches. With the tightly coupled nature of cyber components with the physical domain, these attacks have the potential to cause significant damage if safety-critical applications such as automobiles are compromised. Moving target defense techniques such as instruction set randomization (ISR) have been commonly proposed to address these types of attacks. However, under current implementations an attack can result in system crashing which is unacceptable in CPS. As such, CPS necessitate proper control reconfiguration mechanisms to prevent a loss of availability in system operation. This paper addresses the problem of maintaining system and security properties of a CPS under attack by integrating ISR, detection, and recovery capabilities that ensure safe, reliable, and predictable system operation. Specifically, we consider the problem of detecting code injection attacks and reconfiguring the controller in real-time. The developed framework is demonstrated with an autonomous vehicle case study. 
    more » « less
  4. We consider the problem of optimal control of district cooling energy plants (DCEPs) consisting of multiple chillers, a cooling tower, and a thermal energy storage (TES), in the presence of time-varying electricity price. A straightforward application of model predictive control (MPC) requires solving a challenging mixed-integer nonlinear program (MINLP) because of the on/off of chillers and the complexity of the DCEP model. Reinforcement learning (RL) is an attractive alternative since its real-time control computation is much simpler. But designing an RL controller is challenging due to myriad design choices and computationally intensive training. In this paper, we propose an RL controller and an MPC controller for minimizing the electricity cost of a DCEP and compare them via simulations. The two controllers are designed to be comparable in terms of objective and information requirements. The RL controller uses a novel Q-learning algorithm that is based on least-squares policy iteration. We describe the design choices for the RL controller, including the choice of state space and basis functions, that are found to be effective. The proposed MPC controller does not need a mixed integer solver for implementation, but only a nonlinear program (NLP) solver. A rule-based baseline controller is also proposed to aid in comparison. Simulation results show that the proposed RL and MPC controllers achieve similar savings over the baseline controller, about 17%. 
    more » « less
  5. The Internet of Things will need to support ubiquitous and continuous connectivity to resource constrained and energy constrained devices. To this end, we consider the optimization of cryptographic protocols under energy harvesting conditions. Traditionally, computing using energy harvesting power sources is handled as a case of intermittent-computing: working towards the completion of a goal under uncertain energy supply. In our work we consider the often ignored case when there is harvested energy available but there are no useful operations to complete. In cryptographic protocols, this can occur while the protocol waits for the next message. To avoid waste, we partition cryptographic algorithms into an offline portion and an online portion, where only the online portion has a real-time dependency to the availability of data. The offline portion is precomputed with the result stored as a coupon for the remaining online operation. We show that this structure brings multiple benefits including decreased response latency, a smaller energy store requirement, and reduced energy waste in a harvester supported system. We present a case study of two canonical cryptographic applications: true random number generation and bulk-encryption. We analyze the precomputed implementations on an MSP430 with ferroelectric RAM and an ARM Cortex M4 with nonvolatile flash memory. Our solutions avoid energy waste during the offline phase, and they offer gains in energy efficiency during the online phase of up to 57 times for bulk-encryption and over 100 times for random number generation. 
    more » « less