Title: LDB: An Efficient Latency Profiling Tool for Multithreaded Applications
Maintaining low tail latency is critical for the efficiency and performance of large-scale datacenter systems. Software bugs that cause tail latency problems, however, are notoriously difficult to debug. We present LDB, a new latency profiling tool that aims to overcome this challenge by precisely identifying the specific functions that are responsible for tail latency anomalies. LDB observes the latency of all functions in a running program. It uses a novel, software-only technique called stack sampling, where a busy-spinning stack scanner thread polls lightweight metadata recorded in the call stack, shifting tracing costs away from program threads. In addition, LDB uses event tagging to record requests, inter-thread synchronization, and context switching. This can be used, for example, to generate per-request timelines and to find the root cause of complex tail latency problems such as lock contention in multi-threaded programs. We evaluate LDB with three datacenter applications, finding latency problems in each. Our results further show that LDB produces actionable insights, has low overhead, and can rapidly analyze recordings, making it feasible to use in production settings.
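
The stack-sampling idea can be illustrated with a small user-space sketch. The C program below is a minimal, hypothetical illustration, not LDB's implementation: the instrumented program thread only pushes and pops tiny frame records on a shadow stack (a single global one here, for brevity), while a dedicated busy-spinning scanner thread polls those records and attributes elapsed time to whatever frames are currently live. All identifiers (frame_rec, enter, leave, scanner) are invented for this sketch; LDB itself records its metadata directly in the call stack, additionally tags requests, synchronization, and context switches, and handles the scanner/program-thread races that this toy version ignores.

/* Toy sketch of stack sampling: program thread records cheap per-call
 * metadata, a separate scanner thread does the polling and reporting.
 * Hypothetical code, not LDB's actual design. Build with: cc -pthread */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define MAX_DEPTH 128

struct frame_rec { const char *fn; long long enter_ns; };

/* Per-thread shadow stack (one global instance here for brevity). The
 * program thread writes it; the scanner only reads it, synchronizing on
 * the `depth` counter. */
static struct {
    struct frame_rec frame[MAX_DEPTH];
    _Atomic int depth;
} g_stack;

static _Atomic int running = 1;

static long long now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Instrumented entry/exit: a tiny record plus a counter bump. */
static void enter(const char *fn) {
    int d = atomic_load_explicit(&g_stack.depth, memory_order_relaxed);
    g_stack.frame[d] = (struct frame_rec){ fn, now_ns() };
    atomic_store_explicit(&g_stack.depth, d + 1, memory_order_release);
}

static void leave(void) {
    atomic_fetch_sub_explicit(&g_stack.depth, 1, memory_order_release);
}

/* Busy-spinning scanner: repeatedly reads the shadow stack and remembers
 * the longest-lived frame it has observed. A real tool must cope with
 * frames being popped and reused while they are read; this sketch ignores
 * that race. */
static void *scanner(void *arg) {
    const char *worst_fn = NULL;
    long long worst_ns = 0;
    (void)arg;
    while (atomic_load(&running)) {
        int d = atomic_load_explicit(&g_stack.depth, memory_order_acquire);
        long long t = now_ns();
        for (int i = 0; i < d; i++) {
            long long live = t - g_stack.frame[i].enter_ns;
            if (live > worst_ns) { worst_ns = live; worst_fn = g_stack.frame[i].fn; }
        }
    }
    if (worst_fn)
        printf("longest-observed frame: %s (%lld us)\n", worst_fn, worst_ns / 1000);
    return NULL;
}

static void slow_handler(void) {        /* stand-in for a slow request path */
    enter("slow_handler");
    usleep(20 * 1000);
    leave();
}

int main(void) {
    pthread_t tid;
    pthread_create(&tid, NULL, scanner, NULL);
    for (int i = 0; i < 3; i++)
        slow_handler();
    atomic_store(&running, 0);
    pthread_join(tid, NULL);
    return 0;
}

The point of the split is that the per-call work on the program thread stays small, while all scanning, comparison, and reporting run on the scanner's core.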
Award ID(s):
2104398 2212099
PAR ID:
10506278
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
USENIX
Date Published:
Journal Name:
21st USENIX Symposium on Networked Systems Design and Implementation (NSDI'24)
ISBN:
978-1-939133-39-7
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    The datacenter introduces new challenges for computer systems around tail latency and security. This paper argues that dynamic NUCA techniques are a better solution to these challenges than prior cache designs. We show that dynamic NUCA designs can meet tail-latency deadlines with much less cache space than prior work, and that they also provide a natural defense against cache attacks. Unfortunately, prior dynamic NUCAs have missed these opportunities because they focus exclusively on reducing data movement. We present Jumanji, a dynamic NUCA technique designed for tail latency and security. We show that prior last-level cache designs are vulnerable to new attacks and offer imperfect performance isolation. Jumanji solves these problems while significantly improving performance of co-running batch applications. Moreover, Jumanji only requires lightweight hardware and a few simple changes to system software, similar to prior D-NUCAs. At 20 cores, Jumanji improves batch weighted speedup by 14% on average, vs. just 2% for a non-NUCA design with weaker security, and is within 2% of an idealized design.
  2. This paper extends the reach of General Purpose GPU programming by presenting a software architecture that supports efficient fine-grained synchronization over global memory. The key idea is to transform global synchronization into global communication so that conflicts are serialized at the thread block level. With this structure, the threads within each thread block can synchronize using low latency, high-bandwidth local scratchpad memory. To enable this architecture, we implement a scalable and efficient message passing library. Using Nvidia GTX 1080 Ti GPUs, we evaluate our new software architecture by using it to solve a set of five irregular problems on a variety of workloads. We find that on average, our solutions improve performance over carefully tuned state-of-the-art solutions by 3.6×. (A minimal CPU-side sketch of this send-updates-to-an-owner pattern appears after this list.)
  3.
    This paper demonstrates that it is possible to achieve μs-scale latency using the Linux kernel storage stack, even when tens of latency-sensitive applications compete for host resources with throughput-bound applications that perform read/write operations at throughput close to hardware capacity. Furthermore, such performance can be achieved without any modification in applications, network hardware, kernel CPU schedulers and/or kernel network stack. We demonstrate the above using design, implementation and evaluation of blk-switch, a new Linux kernel storage stack architecture. The key insight in blk-switch is that Linux's multi-queue storage design, along with multi-queue network and storage hardware, makes the storage stack conceptually similar to a network switch. blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. blk-switch evaluation over a variety of scenarios shows that it consistently achieves μs-scale average and tail latency (at both 99th and 99.9th percentiles), while allowing applications to near-perfectly utilize the hardware capacity. (A toy user-space sketch of the prioritized, load-balanced queueing idea appears after this list.)
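
Following up on item 2: the core idea, turning synchronization into communication so that conflicting updates are serialized at a single owner, can be illustrated with a small CPU-side analogy. The C program below is a hypothetical sketch, not the paper's GPU library or API: worker threads never touch the shared table directly; they post update messages into per-worker single-producer/single-consumer rings, and one owner thread drains the rings and applies the updates serially, so the table itself needs no locks or atomics. All identifiers (ring, post, mbox, etc.) are invented for illustration; the paper's actual design serializes conflicts at the thread-block level and synchronizes within a block through scratchpad memory.

/* CPU-side analogy of "synchronization as communication": workers send
 * update messages; a single owner applies them without locking the data.
 * Hypothetical sketch. Build with: cc -pthread */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define WORKERS 4
#define RING    1024
#define KEYS    256
#define UPDATES 100000

struct ring {                        /* lock-free SPSC mailbox */
    _Atomic unsigned head, tail;
    int slot[RING];
};

static struct ring mbox[WORKERS];    /* one mailbox per worker, drained only by the owner */
static long table[KEYS];             /* only the owner thread ever touches this */
static _Atomic int done_workers;

static void post(struct ring *r, int key) {
    unsigned t = atomic_load_explicit(&r->tail, memory_order_relaxed);
    while (t - atomic_load_explicit(&r->head, memory_order_acquire) == RING)
        ;                            /* mailbox full: wait for the owner to drain */
    r->slot[t % RING] = key;
    atomic_store_explicit(&r->tail, t + 1, memory_order_release);
}

static void *worker(void *arg) {
    long id = (long)arg;
    for (int i = 0; i < UPDATES; i++)
        post(&mbox[id], (int)((id * 31 + i) % KEYS));   /* "please do table[key]++" */
    atomic_fetch_add(&done_workers, 1);
    return NULL;
}

static void *owner(void *arg) {
    (void)arg;
    for (;;) {
        int finished = (atomic_load(&done_workers) == WORKERS);
        int drained = 0;
        for (int w = 0; w < WORKERS; w++) {
            struct ring *r = &mbox[w];
            unsigned h = atomic_load_explicit(&r->head, memory_order_relaxed);
            unsigned t = atomic_load_explicit(&r->tail, memory_order_acquire);
            for (; h != t; h++) {                        /* apply updates serially */
                table[r->slot[h % RING]]++;              /* no lock, no atomics */
                drained = 1;
            }
            atomic_store_explicit(&r->head, h, memory_order_release);
        }
        if (finished && !drained)
            break;                   /* all workers done and all mailboxes empty */
    }
    return NULL;
}

int main(void) {
    pthread_t o, w[WORKERS];
    long sum = 0;

    pthread_create(&o, NULL, owner, NULL);
    for (long i = 0; i < WORKERS; i++)
        pthread_create(&w[i], NULL, worker, (void *)i);
    for (int i = 0; i < WORKERS; i++)
        pthread_join(w[i], NULL);
    pthread_join(o, NULL);

    for (int k = 0; k < KEYS; k++)
        sum += table[k];
    printf("applied %ld updates (expected %d)\n", sum, WORKERS * UPDATES);
    return 0;
}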
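
Following up on item 3: the switch-like scheduling that blk-switch borrows from networking can also be sketched in user space. The toy C program below is an assumption-laden illustration, not the kernel implementation: each core owns an "egress queue" holding two classes of requests; latency-sensitive requests stay on the submitting core and are always dispatched first, while throughput-bound requests are steered to the core with the smallest throughput backlog, a stand-in for the paper's load balancing and switch scheduling. All identifiers are hypothetical.

/* Toy user-space model of prioritized, load-balanced per-core queues. */
#include <stdio.h>
#include <string.h>

#define NCORES 4
#define QDEPTH 64

enum req_class { LAT_SENSITIVE, THROUGHPUT };

struct request { int id; enum req_class cls; };

struct egress_queue {                 /* one per core, like a switch egress port */
    struct request lat[QDEPTH], thr[QDEPTH];
    int nlat, nthr;
};

/* Load balancing: pick the core with the least throughput-bound backlog. */
static struct egress_queue *least_loaded(struct egress_queue *q, int n) {
    struct egress_queue *best = &q[0];
    for (int i = 1; i < n; i++)
        if (q[i].nthr < best->nthr)
            best = &q[i];
    return best;
}

static void submit(struct egress_queue *cores, int submitting_core,
                   struct request r) {
    struct egress_queue *q = (r.cls == LAT_SENSITIVE)
        ? &cores[submitting_core]         /* stay local for low latency */
        : least_loaded(cores, NCORES);    /* spread bulk I/O across cores */
    if (r.cls == LAT_SENSITIVE) q->lat[q->nlat++] = r;
    else                        q->thr[q->nthr++] = r;
}

/* Prioritized processing: drain latency-sensitive requests first. */
static int next_request(struct egress_queue *q, struct request *out) {
    if (q->nlat) { *out = q->lat[--q->nlat]; return 1; }
    if (q->nthr) { *out = q->thr[--q->nthr]; return 1; }
    return 0;
}

int main(void) {
    struct egress_queue cores[NCORES];
    struct request r;
    memset(cores, 0, sizeof cores);

    submit(cores, 0, (struct request){ 1, THROUGHPUT });      /* bulk read  */
    submit(cores, 0, (struct request){ 2, THROUGHPUT });      /* bulk read  */
    submit(cores, 0, (struct request){ 3, LAT_SENSITIVE });   /* small read */

    while (next_request(&cores[0], &r))   /* core 0 serves request 3 first */
        printf("core 0 dispatches request %d (%s)\n", r.id,
               r.cls == LAT_SENSITIVE ? "latency-sensitive" : "throughput");
    return 0;
}

The design choice the sketch tries to convey is that isolation comes from queue structure and scheduling policy, not from changing applications or hardware.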