Deriving semantic checkers from tests to detect silent failures in production distributed systems

Lou, Chang; Parikesit, Dimas Shidqi; Huang, Yujin; Yang, Zhewen; Diwangkara, Senapati; Jing, Yuzhuo; Kistijantoro, Achmad Imam; Yuan, Ding; Nath, Suman; Huang, Peng

This content will become publicly available on July 7, 2026

Production distributed systems provide rich features, but various defects can cause a system to silently violate its semantics without raising explicit errors. Such failures have serious consequences, yet they are extremely challenging to detect because writing good checkers requires deep domain knowledge and substantial manual effort. In this paper, we explore a novel approach that directly derives semantic checkers from system test code. We first present a large-scale study of existing system test cases. Guided by the study findings, we develop T2C, a framework that uses static and dynamic analysis to transform and generalize a test into a runtime checker. We apply T2C to four large, popular distributed systems and successfully derive tens to hundreds of checkers. These checkers detect 15 of the 20 real-world silent failures we reproduce and incur small runtime overhead.

Award ID(s): 2441284

PAR ID: 10659227

Author(s) / Creator(s): Lou, Chang; Parikesit, Dimas Shidqi; Huang, Yujin; Yang, Zhewen; Diwangkara, Senapati; Jing, Yuzhuo; Kistijantoro, Achmad Imam; Yuan, Ding; Nath, Suman; Huang, Peng

Publisher / Repository: 19th USENIX Symposium on Operating Systems Design and Implementation

Date Published: 2025-07-07

ISBN: 978-1-939133-47-2

Format(s): Medium: X

Location: Boston, MA, USA

Sponsoring Org: National Science Foundation

Conference Paper:
The DOI is not currently available.
