NSF PAR Search | NSF Public Access Repository

Metaverse as a Service: Megascale Social 3D on the Cloud

https://doi.org/10.1145/3620678.3624662

Haeberlen, Andreas; Phan, Linh Thi; McGuire, Morgan (October 2023, ACM)
CrossTalk: Making Low-Latency Fault Tolerance Cheap by Exploiting Redundant Networks

https://doi.org/10.1145/3609436

Loveless, Andrew; Phan, Linh Thi; Erickson, Lisa; Dreslinski, Ronald; Kasikci, Baris (October 2023, ACM Transactions on Embedded Computing Systems)

Real-time embedded systems perform many important functions in the modern world. A standard way to tolerate faults in these systems is with Byzantine fault-tolerant (BFT) state machine replication (SMR), in which multiple replicas execute the same software and their outputs are compared by the actuators. Unfortunately, traditional BFT SMR protocols are slow, requiring replicas to exchange sensor data back and forth over multiple rounds in order to reach agreement before each execution. The state of the art in reducing the latency of BFT SMR is eager execution, in which replicas execute on data from different sensors simultaneously on different processor cores. However, this technique results in 3–5× higher computation overheads compared to traditional BFT SMR systems, significantly limiting schedulability. We present CrossTalk, a new BFT SMR protocol that leverages the prevalence of redundant switched networks in embedded systems to reduce latency without added computation. The key idea is to use specific algorithms to move messages between redundant network planes (which many systems already possess) as the messages travel from the sensors to the replicas. As a result, CrossTalk can ensure agreement automatically in the network, avoiding the need for any communication between replicas. Our evaluation shows that CrossTalk improves schedulability by 2.13–4.24× over the state of the art. Moreover, in a NASA simulation of a real spaceflight mission, CrossTalk tolerates more faults than the state of the art while using nearly 3× less processor time.
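The output comparison that the abstract attributes to the actuators can be illustrated with a minimal voting sketch. This is not code from the paper: the function name `vote`, the parameter `max_faulty`, and the assumptions that there are 2f + 1 replicas and that correct replicas produce identical outputs are all illustrative.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def vote(outputs: Sequence[Hashable], max_faulty: int) -> Optional[Hashable]:
    """Return the value reported by at least max_faulty + 1 replicas, else None.

    Hypothetical helper: assumes 2f + 1 replicas, at most f of them faulty,
    and that all correct replicas emit the same output.
    """
    if not outputs:
        return None
    # Find the most frequently reported value and how often it appears.
    value, count = Counter(outputs).most_common(1)[0]
    # f + 1 matching outputs mean at least one came from a correct replica.
    return value if count >= max_faulty + 1 else None


if __name__ == "__main__":
    # Three replicas, one of which reports a wrong command.
    print(vote([10, 10, 99], max_faulty=1))
```

The vote is only meaningful if the correct replicas started from the same inputs, which is the agreement step the abstract says CrossTalk moves into the network.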
Full Text Available
PCSPOOF: Compromising the Safety of Time-Triggered Ethernet

https://doi.org/10.1109/SP46215.2023.10179318

Loveless, Andrew; Phan, Linh Thi; Dreslinski, Ronald; Kasikci, Baris (May 2023, 2023 IEEE Symposium on Security and Privacy (SP))

Designers are increasingly using mixed-criticality networks in embedded systems to reduce size, weight, power, and cost. Perhaps the most successful of these technologies is Time-Triggered Ethernet (TTE), which lets critical time-triggered (TT) traffic and non-critical best-effort (BE) traffic share the same switches and cabling. A key aspect of TTE is that the TT part of the system is isolated from the BE part, and thus BE devices have no way to disrupt the operation of the TTE devices. This isolation allows designers to: (1) use untrusted, but low cost, BE hardware, (2) lower BE security requirements, and (3) ignore BE devices during safety reviews and certification procedures. We present PCSPOOF, the first attack to break TTE’s isolation guarantees. PCSPOOF is based on two key observations. First, it is possible for a BE device to infer private information about the TT part of the network that can be used to craft malicious synchronization messages. Second, by injecting electrical noise into a TTE switch over an Ethernet cable, a BE device can trick the switch into sending these malicious synchronization messages to other TTE devices. Our evaluation shows that successful attacks are possible in seconds, and that each successful attack can cause TTE devices to lose synchronization for up to a second and drop tens of TT messages, both of which can result in the failure of critical systems like aircraft or automobiles. We also show that, in a simulated spaceflight mission, PCSPOOF causes uncontrolled maneuvers that threaten safety and mission success. We disclosed PCSPOOF to aerospace companies using TTE, and several are implementing mitigations from this paper.
Full Text Available
Multi-mode on Multi-core: Making the best of both worlds with Omni

https://doi.org/10.1109/RTSS55097.2022.00020

Gifford, Robert; Phan, Linh Thi (December 2022, 2022 IEEE Real-Time Systems Symposium (RTSS))

When scheduling multi-mode real-time systems on multi-core platforms, a key question is how to dynamically adjust shared resources, such as cache and memory bandwidth, when resource demands change, without jeopardizing schedulability during mode changes. This paper presents Omni, a first end-to-end solution to this problem. Omni consists of a novel multi-mode resource allocation algorithm and a resource-aware schedulability test that supports general mode-change semantics as well as dynamic cache and bandwidth resource allocation. Omni's resource allocation leverages the platform's concurrency and the diversity of the tasks' demands to minimize overload during mode transitions; it does so by intelligently co-distributing tasks and resources across cores. Omni's schedulability test ensures predictable mode transitions, and it takes into account mode-change effects on the resource demands on different cores, so as to best match their dynamic needs using the available resources. We have implemented a prototype of Omni, and we have evaluated it using randomly generated multi-mode systems with several real-world benchmarks as the workload. Our results show that Omni has low overhead, and that it is substantially more effective in improving schedulability than the state of the art.
Full Text Available
DNA: Dynamic Resource Allocation for Soft Real-Time Multicore Systems

https://doi.org/10.1109/RTAS52030.2021.00024

Gifford, Robert; Gandhi, Neeraj; Phan, Linh Thi; Haeberlen, Andreas (May 2021, Proceedings of the 27th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS '21))
Modern latency-sensitive and real-time systems often use multi-core platforms; thus, tasks on different cores share certain hardware resources, such as the memory bus and certain cache levels. This has two undesirable consequences: (1) tasks can interfere with each other, causing high latency for the system as a whole, and (2) it becomes difficult to meet deadlines, since the worst-case timing of a given task depends on the worst task it might have to compete with. Static partitioning isolates tasks from each other by allocating a certain fraction of the resources to each; however, many tasks execute in different phases (e.g., memory-intensive and CPU-intensive) that have different requirements. Thus, system designers are left with a choice between overprovisioning, based on the most demanding phase, or suboptimal performance. In this paper, we propose a pair of techniques, called DNA and DADNA, to address the above challenge. DNA increases throughput and decreases latency, by building an execution profile of each task to identify the phases, and then dynamically allocating resources based on which task can benefit the most; DADNA further adds support for soft real-time workloads by taking deadlines into account. We have built a prototype of both techniques in the Xen hypervisor; our experimental results show that, compared to a state-of-the-art solution, DNA and DADNA can substantially improve schedulability, reduce job deadline miss ratios, and cut latencies by more than a factor of two even in extremely overloaded situations.
Full Text Available
When Idling is Ideal: Optimizing Tail-Latency for Heavy-Tailed Datacenter Workloads with Perséphone

https://doi.org/10.1145/3477132.3483571

Demoulin, Henri Maxime; Fried, Joshua; Pedisich, Isaac; Kogias, Marios; Loo, Boon Thau; Phan, Linh Thi; Zhang, Irene (October 2021, Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles (SOSP))

Full Text Available
REBOUND: Defending Distributed Systems Against Attacks with Bounded-Time Recovery

https://doi.org/10.1145/3447786.3456257

Gandhi, Neeraj; Roth, Edo; Sandler, Brian; Haeberlen, Andreas; Phan, Linh Thi (April 2021, Proceedings of the 16th European Conference on Computer Systems (EuroSys'21))
This paper shows how to use bounded-time recovery (BTR) to defend distributed systems against non-crash faults and attacks. Unlike many existing fault-tolerance techniques, BTR does not attempt to completely mask all symptoms of a fault; instead, it ensures that the system returns to the correct behavior within a bounded amount of time. This weaker guarantee is sufficient, e.g., for many cyber-physical systems, where physical properties - such as inertia and thermal capacity - prevent quick state changes and thus limit the damage that can result from a brief period of undefined behavior. We present an algorithm called REBOUND that can provide BTR for the Byzantine fault model. REBOUND works by detecting faults and then reconfiguring the system to exclude the faulty nodes. This supports very fine-grained responses to faults: for instance, the system can move or replace existing tasks, or drop less critical tasks entirely to conserve resources. REBOUND can take useful actions even when a majority of the nodes is compromised, and it requires less redundancy than full fault-tolerance.
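The reconfiguration step described in the abstract (exclude faulty nodes, move tasks, drop less critical ones) can be pictured with a small sketch. This is not the REBOUND algorithm: the `Task` fields, the `reconfigure` function, the capacity units, and the placement heuristic (most critical task first, onto the healthy node with the most free capacity) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    criticality: int  # higher means more important (hypothetical scale)
    load: int         # capacity units the task needs (hypothetical)


def reconfigure(
    tasks: Iterable[Task], capacity: Dict[str, int], faulty: Set[str]
) -> Tuple[Dict[str, str], List[str]]:
    """Place tasks on healthy nodes, most critical first.

    Returns (placement, dropped): a task-to-node map and the names of
    tasks that no longer fit once the faulty nodes are excluded.
    """
    # Only nodes not flagged as faulty may host tasks.
    free = {node: cap for node, cap in capacity.items() if node not in faulty}
    placement: Dict[str, str] = {}
    dropped: List[str] = []
    for task in sorted(tasks, key=lambda t: -t.criticality):
        # Pick the healthy node with the most free capacity, if any exists.
        node = max(free, key=free.get, default=None)
        if node is not None and free[node] >= task.load:
            placement[task.name] = node
            free[node] -= task.load
        else:
            # Not enough healthy capacity left: shed the task.
            dropped.append(task.name)
    return placement, dropped


if __name__ == "__main__":
    tasks = [Task("ctl", 3, 2), Task("nav", 2, 2), Task("log", 1, 2)]
    print(reconfigure(tasks, {"a": 2, "b": 2, "c": 2}, faulty={"c"}))
```

In this toy setup, losing one of three nodes leaves room for only the two most critical tasks, so the least critical one is shed rather than masked by extra redundancy.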
Full Text Available
Self-Reconfiguration in Response to Faults in Modular Aerial Systems

https://doi.org/10.1109/LRA.2020.2970685

Gandhi, Neeraj; Saldana, David; Kumar, Vijay; Phan, Linh Thi (April 2020, IEEE Robotics and Automation Letters)

Full Text Available
Holistic multi-resource allocation for multicore real-time virtualization

https://doi.org/10.1145/3316781.3317840

Xu, Meng; Gifford, Robert; Phan, Linh Thi (January 2019, 56th Annual Design Automation Conference)

Full Text Available
Zeno: Diagnosing Performance Problems with Temporal Provenance

Wu, Yang; Chen, Ang; Phan, Linh Thi (January 2019, USENIX NSDI)

Full Text Available
