skip to main content


Title: Plan B - Design Methodology for Cyber-Physical Systems Robust to Timing Failures
Many Cyber-Physical Systems (CPS) have timing constraints that must be met by the cyber components (software and the network) to ensure safety. It is a tedious job to check if a CPS meets its timing requirement especially when they are distributed and the software and/or the underlying computing platforms are complex. Furthermore, the system design is brittle since a timing failure can still happen e.g., network failure, soft error bit flip, etc. In this paper, we propose a new design methodology called Plan B where timing constraints of the CPS are monitored at the runtime, and a proper backup routine is executed when a timing failure happens to ensure safety. We provide a model on how to express the desired timing behavior using a set of timing constructs in a C/C++ code and how to efficiently monitor them at the runtime. We showcase the effectiveness of our approach by conducting experiments on three case studies: 1) the full software stack for autonomous driving (Apollo), 2) a multi-agent system with 1/10th scale model robots, and 3) a quadrotor for search and rescue application. We show that the system remains safe and stable even when intentional faults are injected to cause a timing failure. We also demonstrate that the system can achieve graceful degradation when a less extreme timing failure happens.  more » « less
Award ID(s):
1645578
NSF-PAR ID:
10321993
Author(s) / Creator(s):
; ; ; ; ; ; ; ; ; ;
Date Published:
Journal Name:
ACM Transactions on Cyber-Physical Systems
ISSN:
2378-962X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Cyber-physical systems (CPS) are required to satisfy safety constraints in various application domains such as robotics, industrial manufacturing systems, and power systems. Faults and cyber attacks have been shown to cause safety violations, which can damage the system and endanger human lives. Resilient architectures have been proposed to ensure safety of CPS under such faults and attacks via methodologies including redundancy and restarting from safe operating conditions. The existing resilient architectures for CPS utilize different mechanisms to guarantee safety, and currently, there is no common framework to compare them. Moreover, the analysis and design undertaken for CPS employing one architecture is not readily extendable to another. In this article, we propose a timing-based framework for CPS employing various resilient architectures and develop a common methodology for safety analysis and computation of control policies and design parameters. Using the insight that the cyber subsystem operates in one out of a finite number of statuses, we first develop a hybrid system model that captures CPS adopting any of these architectures. Based on the hybrid system, we formulate the problem of joint computation of control policies and associated timing parameters for CPS to satisfy a given safety constraint and derive sufficient conditions for the solution. Utilizing the derived conditions, we provide an algorithm to compute control policies and timing parameters relevant to the employed architecture. We also note that our solution can be applied to a wide class of CPS with polynomial dynamics and also allows incorporation of new architectures. We verify our proposed framework by performing a case study on adaptive cruise control of vehicles.

     
    more » « less
  2. Since many Cyber–Physical Systems (CPS) interact with the real world, they are safety- or mission- critical. Temporal specification languages like STL (Signal Temporal Logic) have been developed to capture the properties that built CPS must meet. However, the existing temporal logics/languages do not provide a natural way to express the tolerance with which the timing properties must be met. As a consequence of this, the specified properties may be vague, the ensuing CPS design may end up being over- or under-provisioned, and the validation of whether the built CPS meets the specified CPS properties may turn out to be erroneous. To address these issues, a run-time verification methodology is proposed, that allows users to explicitly specify the tolerance with which timing properties must be met. To ensure the correctness of measurement-based validation of a built CPS, this article: (i) proposes a test to determine if a given measurement system can validate the properties specified in TTL, and (ii) proposes a measurement-based testing methodology to provide one-sided guarantee that the built CPS meets the specified CPS properties. The guarantees are one-sided in the sense that when the measurement-based testing concludes that the properties are met, then they are guaranteed to be met (so not false positive). However, when the measurement-based testing concludes that the properties were not met, then they may have met (there can be false negative). In order to validate our claims, we built a model of flying paster (part of the printing press that swaps in a new roll of paper when the current roll is about to finish) using Arduino Mega 2560 and two Hansen brushed DC motors and specified the timing constraints among the various events in this system, along with the tolerances with which they should be met in TTL. We generated the testing logic and validated that we get no false positive, even though we encounter 4.04% false negative. The rate of false negatives can be reduced to be less than any arbitrary value by using more accurate measurement equipment. 
    more » « less
  3. null (Ed.)
    Defense mechanisms against network-level attacks are commonly based on the use of cryptographic techniques, such as lengthy message authentication codes (MAC) that provide data integrity guarantees. However, such mechanisms require significant resources (both computational and network bandwidth), which prevents their continuous use in resource-constrained cyber-physical systems (CPS). Recently, it was shown how physical properties of controlled systems can be exploited to relax these stringent requirements for systems where sensor measurements and actuator commands are transmitted over a potentially compromised network; specifically, that merely intermittent use of data authentication (i.e., at occasional time points during system execution), can still provide strong Quality-of-Control (QoC) guarantees even in the presence of false-data injection attacks, such as Man-in-the-Middle (MitM) attacks. Consequently, in this work, we focus on integrating security into existing resource-constrained CPS, in order to protect against MitM attacks on a system where a set of control tasks communicates over a real-time network with system sensors and actuators. We introduce a design-time methodology that incorporates requirements for QoC in the presence of attacks into end-to-end timing constraints for real-time control transactions, which include data acquisition and authentication, real-time network messages, and control tasks. This allows us to formulate a mixed integer linear programming-based method for direct synthesis of schedulable tasks and message parameters (i.e., deadlines and offsets) that do not violate timing requirements for the already deployed controllers, while adding a sufficient level of protection against network-based attacks; specifically, the synthesis method also provides suitable intermittent authentication policies that ensure the desired QoC levels under attack. To additionally reduce the security-related bandwidth overhead, we propose the use of cumulative message authentication at time instances when the integrity of messages from subsets of sensors should be ensured. Furthermore, we introduce a method for the opportunistic use of the remaining resources to further improve the overall QoC guarantees while ensuring system (i.e., task and message) schedulability. Finally, we demonstrate applicability and scalability of our methodology on synthetic automotive systems as well as a real-world automotive case-study. 
    more » « less
  4. Existing design techniques for providing security guarantees against network-based attacks in cyber-physical systems (CPS) are based on continuous use of standard cryptographic tools to ensure data integrity. This creates an apparent conflict with common resource limitations in these systems, given that, for instance, lengthy message authentication codes (MAC) introduce significant overheads. We present a framework to ensure both timing guarantees for real-time network messages and Quality-of-Control (QoC) in the presence of network-based attacks. We exploit physical properties of controlled systems to relax constant integrity enforcement requirements, and show how the problem of feasibility testing of intermittently authenticated real-time messages can be cast as a mixed integer linear programming problem. Besides scheduling a set of real-time messages with predefined authentication rates obtained from QoC requirements, we show how to optimally increase the overall system QoC while ensuring that all real-time messages are schedulable. Finally, we introduce an efficient runtime bandwidth allocation method, based on opportunistic scheduling, in order to improve QoC. We evaluate our framework on a standard benchmark designed for CAN bus, and show how an infeasible message set with strong security guarantees can be scheduled if dynamics of controlled systems are taken into account along with real-time requirements. 
    more » « less
  5. Formal verification of cyber-physical systems (CPS) is challenging because it has to consider real-time and concurrency aspects that are often absent in ordinary software. Moreover, the software in CPS is often complex and low-level, making it hard to assure that a formal model of the system used for verification is a faithful representation of the actual implementation, which can undermine the value of a verification result. To address this problem, we propose a methodology for building verifiable CPS based on the principle that a formal model of the software can be derivedautomaticallyfrom its implementation. Our approach requires that the system implementation is specified inLingua Franca(LF), a polyglot coordination language tailored for real-time, concurrent CPS, which we made amenable to the specification of safety properties via annotations in the code. The program structure and the deterministic semantics of LF enable automatic construction of formal axiomatic models directly from LF programs. The generated models are automatically checked using Bounded Model Checking (BMC) by the verification engineUclid5using theZ3SMT solver. The proposed technique enables checking a well-defined fragment of Safety Metric Temporal Logic (Safety MTL) formulas. To ensure the completeness of BMC, we present a method to derive an upper bound on the completeness threshold of an axiomatic model based on the semantics of LF. We implement our approach in the LF Verifierand evaluate it using a benchmark suite with 22 programs sampled from real-life applications and benchmarks for Erlang, Lustre, actor-oriented languages, and RTOSes. The LF Verifiercorrectly checks 21 out of 22 programs automatically.

     
    more » « less