Efficient Exposure of Partial Failure Bugs in Distributed Systems with Inferred Abstract States

Wu, Haoze; Pan, Jia; Huang, Peng

Citation Details

Many distributed system failures, especially the notorious partial service failures, are caused by bugs that are only triggered by subtle faults at rare timing. Existing testing is inefficient in exposing such bugs. This paper presents Legolas, a fault injection testing framework designed to address this gap. To precisely simulate subtle faults, Legolas statically analyzes the system code and instruments hooks within a system. To efficiently explore numerous faults, Legolas introduces a novel notion of abstract states and automatically infers abstract states from code. During testing, Legolas designs an algorithm that leverages the inferred abstract states to make careful fault injection decisions. We applied Legolas on the latest releases of six popular, extensively tested distributed systems. Legolas found 20 new bugs that result in partial service failures.

Award ID(s): 2317698; 2317751

PAR ID: 10546455

Author(s) / Creator(s): Wu, Haoze; Pan, Jia; Huang, Peng

Publisher / Repository: 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24)

Date Published: 2024-04-16

ISBN: 978-1-939133-39-7

Location: Santa Clara, CA

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
