NSF PAR Search | NSF Public Access Repository


RoboRebound: Multi-Robot System Defense with Bounded-Time Interaction

https://doi.org/10.1145/3689031.3696079

Gandhi, Neeraj; Cai, Yifan; Haeberlen, Andreas; Phan, Linh Thi Xuan (March 2025, ACM)

Byzantine Fault Tolerance (BFT) is a classic technique for defending distributed systems against a wide range of faults and attacks. However, existing solutions are designed for systems where nodes can interact only by exchanging messages. They are not directly applicable to systems where nodes have sensors and actuators and can also interact in the physical world – perhaps by blocking each other’s path or by crashing into each other. In this paper, we take a first stab at extending BFT to this larger class of systems. We focus on multi-robot systems (MRS), an emerging technology that is increasingly being deployed for applications such as target tracking, warehouse logistics, and exploration. An MRS can consist of dozens of interacting robots and is thus a bona fide distributed system. The classic masking guarantee is not practical in an MRS, but we propose a variant called bounded-time interaction that can be implemented, and we present an algorithm that achieves it, in combination with a few small hardware tweaks. We built a simulator and prototyped wheeled robots to show that our algorithm is effective, and that it has a reasonable overhead.
Free, publicly-accessible full text available March 30, 2026
Rotor Fault Detection and Isolation in Aerial Vehicles with Dozens of Rotors

https://doi.org/10.1109/LARS64411.2024.10786448

Gandhi, Neeraj; Xu, Jiawei; Saldaña, David; Phan, Linh Thi Xuan (November 2024, IEEE)

Aerial vehicles with dozens of rotors are becoming increasingly common in important applications such as transportation and construction. One challenge with building such a system is to ensure that the system is robust against faults: as the number of rotors increases, the likelihood of a rotor failing during operation also increases; despite the spare thrust capacity provided by the redundant rotors, a rotor fault can significantly impact the motion and safety of the system. This paper presents an efficient fault detection and isolation (FDI) method for aerial vehicles with a large number of rotors. Our approach relies on two key insights: First, a faulty rotor directly affects the tracking error in roll and in pitch. This property can be used to order our faulty rotor search space. Second, the error in either roll or pitch is related to both the distance from the (relevant) axis and the severity of a fault. With these observations, we can use probe faults to isolate faulty rotors. Evaluation results show that our technique can efficiently detect and isolate faults in multi-rotor aerial vehicles with up to 64 rotors (8 more rotors than in existing FDI work), and that it can help improve robustness. To the best of our knowledge, our FDI method is the first that scales to several dozens of rotors.
Free, publicly-accessible full text available November 11, 2025
Analysis of Long-term Average Behaviors of Probabilistic Task Systems

https://doi.org/10.1145/3696355.3696365

Cai, Yifan; Phan, Linh Thi Xuan; Thiagarajan, PS (November 2024, ACM)

Free, publicly-accessible full text available November 6, 2025
Online Rotor Fault Detection and Isolation for Vertical Takeoff and Landing Vehicles

https://doi.org/10.1109/IROS58592.2024.10802021

Lian, Jiaqi; Gandhi, Neeraj; Wang, Yifan; Phan, Linh Thi Xuan (October 2024, IEEE)

Vertical take-off and landing (VTOL) vehicles are becoming increasingly popular for real-world transport; but, as with any vehicle, guaranteeing safety is both extremely critical and highly challenging due to issues like rotor faults. Existing fault detection and isolation (FDI) techniques usually focus on multirotor systems or fixed-wing systems, rather than the hybrid VTOLs. Since VTOLs have both rotors and ailerons, a fault in a rotor may be masked by the (correctly working) ailerons, making it much more difficult to detect faults. However, this masking only works when ailerons are used (e.g., during cruising), leaving the takeoff and landing vulnerable to crashes. This paper presents an online rotor fault detection and isolation (FDI) method for VTOLs. The approach uses pose analysis and aileron command data to quickly and accurately identify the faulty rotor and to compute the severity of the fault. Our method works for hard-to-detect fault scenarios, such as small-severity faults that are masked during cruise flight but not during vertical motion. We evaluated our technique in a SITL PX4 simulation of a modified Deltaquad QuadPlane. The results show that our FDI technique can quickly detect and isolate faults in real time (within 1 s to 2.5 s) and achieve a high isolation success rate (91.67%) across six rotors, and that it can estimate the severity of faults to within 2%. When applying a simple recovery process post-isolation, the system consistently achieved safe landing.
Full Text Available
Object-oriented Unified Encrypted Memory Management for Heterogeneous Memory Architectures

https://doi.org/10.1145/3654958

Sha, Mo; Cai, Yifan; Wang, Sheng; Phan, Linh Thi Xuan; Li, Feifei; Tan, Kian-Lee (May 2024, Proceedings of the ACM on Management of Data)

In contemporary database applications, the demand for memory resources is intensively high. To enhance adaptability to varying resource needs and improve cost efficiency, the integration of diverse storage technologies within heterogeneous memory architectures emerges as a promising solution. Despite the potential advantages, there exists a significant gap in research related to the security of data within these complex systems. This paper endeavors to fill this void by exploring the intricacies and challenges of ensuring data security in object-oriented heterogeneous memory systems. We introduce the concept of Unified Encrypted Memory (UEM) management, a novel approach that provides unified object references essential for data management platforms, while simultaneously concealing the complexities of physical scheduling from developers. At the heart of UEM lies the seamless and efficient integration of data encryption techniques, which are designed to ensure data integrity and guarantee the freshness of data upon access. Our research meticulously examines the security deficiencies present in existing heterogeneous memory system designs. By advancing centralized security enforcement strategies, we aim to achieve efficient object-centric data protection. Through extensive evaluations conducted across a variety of memory configurations and tasks, our findings highlight the effectiveness of UEM. The security features of UEM introduce low and acceptable overheads, and UEM outperforms conventional security measures in terms of speed and space efficiency.
Full Text Available
Decntr: Optimizing Safety and Schedulability with Multi-Mode Control and Resource Allocation Co-Design

https://doi.org/10.1109/RTAS61025.2024.00032

Gifford, Robert; Galarza-Jimenez, Felipe; Phan, Linh Thi Xuan; Zamani, Majid (May 2024, IEEE)

Full Text Available
CrossTalk: Making Low-Latency Fault Tolerance Cheap by Exploiting Redundant Networks

https://doi.org/10.1145/3609436

Loveless, Andrew; Phan, Linh Thi; Erickson, Lisa; Dreslinski, Ronald; Kasikci, Baris (October 2023, ACM Transactions on Embedded Computing Systems)

Real-time embedded systems perform many important functions in the modern world. A standard way to tolerate faults in these systems is with Byzantine fault-tolerant (BFT) state machine replication (SMR), in which multiple replicas execute the same software and their outputs are compared by the actuators. Unfortunately, traditional BFT SMR protocols are slow, requiring replicas to exchange sensor data back and forth over multiple rounds in order to reach agreement before each execution. The state of the art in reducing the latency of BFT SMR is eager execution, in which replicas execute on data from different sensors simultaneously on different processor cores. However, this technique results in 3–5× higher computation overheads compared to traditional BFT SMR systems, significantly limiting schedulability. We present CrossTalk, a new BFT SMR protocol that leverages the prevalence of redundant switched networks in embedded systems to reduce latency without added computation. The key idea is to use specific algorithms to move messages between redundant network planes (which many systems already possess) as the messages travel from the sensors to the replicas. As a result, CrossTalk can ensure agreement automatically in the network, avoiding the need for any communication between replicas. Our evaluation shows that CrossTalk improves schedulability by 2.13–4.24× over the state of the art. Moreover, in a NASA simulation of a real spaceflight mission, CrossTalk tolerates more faults than the state of the art while using nearly 3× less processor time.
Full Text Available
Arboretum: A Planner for Large-Scale Federated Analytics with Differential Privacy

https://doi.org/10.1145/3600006.3624566

Margolin, Elizabeth; Newatia, Karan; Luo, Tao; Roth, Edo; Haeberlen, Andreas (October 2023, ACM Symposium on Operating Systems Principles (SOSP '23))

Full Text Available
PCSPOOF: Compromising the Safety of Time-Triggered Ethernet

https://doi.org/10.1109/SP46215.2023.10179318

Loveless, Andrew; Phan, Linh Thi; Dreslinski, Ronald; Kasikci, Baris (May 2023, 2023 IEEE Symposium on Security and Privacy (SP))

Designers are increasingly using mixed-criticality networks in embedded systems to reduce size, weight, power, and cost. Perhaps the most successful of these technologies is Time-Triggered Ethernet (TTE), which lets critical time-triggered (TT) traffic and non-critical best-effort (BE) traffic share the same switches and cabling. A key aspect of TTE is that the TT part of the system is isolated from the BE part, and thus BE devices have no way to disrupt the operation of the TTE devices. This isolation allows designers to: (1) use untrusted, but low cost, BE hardware, (2) lower BE security requirements, and (3) ignore BE devices during safety reviews and certification procedures. We present PCSPOOF, the first attack to break TTE’s isolation guarantees. PCSPOOF is based on two key observations. First, it is possible for a BE device to infer private information about the TT part of the network that can be used to craft malicious synchronization messages. Second, by injecting electrical noise into a TTE switch over an Ethernet cable, a BE device can trick the switch into sending these malicious synchronization messages to other TTE devices. Our evaluation shows that successful attacks are possible in seconds, and that each successful attack can cause TTE devices to lose synchronization for up to a second and drop tens of TT messages, both of which can result in the failure of critical systems like aircraft or automobiles. We also show that, in a simulated spaceflight mission, PCSPOOF causes uncontrolled maneuvers that threaten safety and mission success. We disclosed PCSPOOF to aerospace companies using TTE, and several are implementing mitigations from this paper.
Full Text Available
Multi-mode on Multi-core: Making the best of both worlds with Omni

https://doi.org/10.1109/RTSS55097.2022.00020

Gifford, Robert; Phan, Linh Thi (December 2022, 2022 IEEE Real-Time Systems Symposium (RTSS))

When scheduling multi-mode real-time systems on multi-core platforms, a key question is how to dynamically adjust shared resources, such as cache and memory bandwidth, when resource demands change, without jeopardizing schedulability during mode changes. This paper presents Omni, a first end-to-end solution to this problem. Omni consists of a novel multi-mode resource allocation algorithm and a resource-aware schedulability test that supports general mode-change semantics as well as dynamic cache and bandwidth resource allocation. Omni's resource allocation leverages the platform's concurrency and the diversity of the tasks' demands to minimize overload during mode transitions; it does so by intelligently co-distributing tasks and resources across cores. Omni's schedulability test ensures predictable mode transitions, and it takes into account mode-change effects on the resource demands on different cores, so as to best match their dynamic needs using the available resources. We have implemented a prototype of Omni, and we have evaluated it using randomly generated multi-mode systems with several real-world benchmarks as the workload. Our results show that Omni has low overhead, and that it is substantially more effective in improving schedulability than the state of the art.
Full Text Available
