VPP: The Vulnerability-Proportional Protection Paradigm Towards Reliable Autonomous Machines

Zishen, Wan; Yiming, Gan; Bo, Yu; Arijit, Raychowdhury; Yuhao, Zhu

The next ubiquitous computing platform, after personal computers and smartphones, is likely to be autonomous in nature: drones, robots, and self-driving cars have moved from lab concepts to permeating almost every aspect of our society [16, 20, 25]. Behind the proliferation of autonomous machines is the critical need to ensure reliability [7, 22–24]. Almost every vendor, be it in the software, hardware, or systems segment, has to conform to functional safety standards when shipping products for automotive use. Today's resiliency solutions for autonomous machines, however, all make fundamental trade-offs between resiliency and cost, which manifest as high overhead in performance, energy, and chip area. For instance, hardware modular redundancy provides high safety but more than doubles the area and energy cost [1]. The reason is that today's solutions are "one-size-fits-all" in nature: they use the same protection scheme throughout the entire computing stack of autonomous machines. As a result, they have to accommodate the least robust component, leading to a high protection overhead. The insight of this paper is that for a resiliency solution to provide high protection coverage while introducing little cost, we must exploit the inherent robustness variations in domain-specific autonomous machine computing. In particular, we show that different autonomous machine kernels differ significantly in their inherent robustness and performance. Building on that, we propose the Vulnerability-Proportional Protection (VPP) design paradigm, in which the protection budget, whether spent spatially (e.g., modular redundancy) or temporally (e.g., re-execution), should be inversely proportional to the inherent robustness of a task in the autonomous machine system.
In stark contrast to the existing "one-size-fits-all" strategy, VPP allocates the protection budget according to each task's vulnerability, thus achieving the same protection coverage with little overhead and providing a blueprint design paradigm towards reliable autonomous machines.
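The allocation rule stated above (protection budget inversely proportional to inherent robustness) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `allocate_protection`, the kernel names, the robustness scores, and the budget units are all hypothetical, chosen only to show the arithmetic of an inverse-proportional split.

```python
def allocate_protection(robustness, total_budget):
    """Split total_budget across kernels in inverse proportion to robustness.

    robustness: dict mapping kernel name -> positive robustness score
                (hypothetical metric; higher means more inherently robust).
    total_budget: total protection budget in arbitrary units (assumption).
    """
    if not robustness:
        raise ValueError("at least one kernel is required")
    if any(score <= 0 for score in robustness.values()):
        raise ValueError("robustness scores must be positive")

    # Weight each kernel by 1 / robustness, so less robust kernels weigh more.
    weights = {name: 1.0 / score for name, score in robustness.items()}
    weight_sum = sum(weights.values())

    # Normalize so the per-kernel budgets add up to total_budget.
    return {name: total_budget * w / weight_sum for name, w in weights.items()}


# Hypothetical robustness scores for three kernels of an autonomous machine.
kernels = {"kernel_a": 4.0, "kernel_b": 1.0, "kernel_c": 1.0}
budget = allocate_protection(kernels, total_budget=9.0)

for name, share in budget.items():
    print(f"{name}: {share:.2f}")
```

With these assumed scores, the kernel that is four times as robust receives one quarter of the budget given to each of the other two, whereas a "one-size-fits-all" scheme would give every kernel the share needed by the least robust one.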
