Disco: A Compact Index for LSM-trees

Zhong, Wenshao; Chen, Chen; Wu, Xingbo; Eriksson, Jakob

doi:10.1145/3709683

This content will become publicly available on February 10, 2026

Many key-value stores and database systems use log-structured merge-trees (LSM-trees) as their storage engines because of their excellent write performance. However, the read performance of LSM-trees is suboptimal due to the overlapping sorted runs. Most existing efforts rely on filters to reduce unnecessary I/Os, but filters fundamentally do not help locate items and often become the bottleneck of the system. We identify that the lack of an efficient index is the root cause of subpar read performance in LSM-trees. In this paper, we propose Disco: a compact index for LSM-trees. Disco indexes all the keys in an LSM-tree, so a query does not have to search every run of the LSM-tree. It records compact key representations to minimize the number of key comparisons so as to minimize cache misses and I/Os for both point and range queries. Disco guarantees that both point queries and seeks issue at most one I/O to the underlying runs, achieving an I/O efficiency close to a B⁺-tree. Disco improves upon REMIX's pioneering multi-run index design with additional compact key representations to help improve read performance. The representations are compact so the cost of persisting Disco to disk is small. Moreover, while a traditional LSM-tree has to choose a more aggressive compaction policy that slows down write performance to have better read performance, a Disco-indexed LSM-tree can employ a write-efficient policy and still have good read performance. Experimental results show that Disco can save I/Os and improve point and range query performance by up to 220% over RocksDB while maintaining efficient writes.
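The contrast the abstract draws between searching every run and consulting one index over all keys can be illustrated with a toy sketch. This is not Disco's data structure: it stores full keys rather than compact key representations, and the function names, the in-memory list layout, and the use of "runs probed" as a stand-in for I/O count are all assumptions made for illustration.

```python
from bisect import bisect_left

# Each run is a sorted list of (key, value) pairs; runs[0] is the newest.

def lookup_per_run(runs, key):
    """Unindexed point lookup: probe runs newest-first until the key is found.

    Returns (value, runs_probed); runs_probed is a rough proxy for I/Os.
    """
    probes = 0
    for run in runs:
        probes += 1
        keys = [k for k, _ in run]
        i = bisect_left(keys, key)
        if i < len(keys) and keys[i] == key:
            return run[i][1], probes
    return None, probes

def build_global_index(runs):
    """Hypothetical multi-run index: one sorted list of
    (key, (run_id, position)) entries, keeping only the newest version."""
    newest = {}
    for run_id, run in enumerate(runs):
        for pos, (k, _) in enumerate(run):
            newest.setdefault(k, (run_id, pos))  # earlier run_id = newer data
    return sorted(newest.items())

def lookup_indexed(runs, index, key):
    """Indexed point lookup: search the index, then touch at most one run."""
    keys = [k for k, _ in index]
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        run_id, pos = index[i][1]
        return runs[run_id][pos][1], 1
    return None, 0  # index already shows the key is absent; no run is read
```

In this sketch a lookup for a key that lives only in the oldest of three runs probes all three runs without the index and exactly one run with it, which mirrors the "at most one I/O to the underlying runs" property stated above, under the simplifying assumption that the index itself is already in memory.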

Award ID(s): 2114218

PAR ID: 10573375

Author(s) / Creator(s): Zhong, Wenshao; Chen, Chen; Wu, Xingbo; Eriksson, Jakob

Publisher / Repository: ACM Digital Library

Date Published: 2025-02-10

Journal Name: Proceedings of the ACM on Management of Data

Volume: 3

Issue: 1

ISSN: 2836-6573

Page Range / eLocation ID: 1 to 27

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
https://doi.org/10.1145/3709683
