Unlocking the Unusable: A Proactive Caching Framework for Reusing Partial Overlapped Data

Guo, Chang; Podhorszki, Norbert; Eisenhauer, Greg; Xie, Zhiwen; Klasky, Scott; Cao, Zhichao

doi:10.1145/3736548.3737839

Cache systems are widely used to speed up data retrieval. Modern HPC, data analytics, and AI/ML workloads generate vast, multi-dimensional datasets that are accessed via complex queries. However, the probability that different queries request exactly the same data is low, so a traditional key-value cache yields only limited performance improvement. In this paper, we present Mosaic-Cache, a proactive and general caching framework that lets applications efficiently reuse partially overlapped data through novel overlap-aware cache interfaces for fast content-level reuse. The core components include a metadata manager that leverages customizable indexing for fast overlap lookups, an adaptive fetch planner for dynamic cache-to-storage decisions, and an asynchronous merger that reduces cache fragmentation and redundancy. Evaluations on real-world HPC datasets show that Mosaic-Cache improves overall performance by up to 4.1× over a traditional key-value cache while adding minimal overhead in worst-case scenarios.
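To make the idea of partial overlapped data reuse concrete, the following is a minimal sketch, not Mosaic-Cache's actual interface. It assumes a simplified one-dimensional case in which every request is a half-open index range; the class name `OverlapCache`, its `plan` and `get` methods, and the `fetch` callback are all hypothetical names chosen for illustration. The `plan` method plays a role loosely analogous to the overlap lookup and fetch planning described above: it splits a query into the parts already cached and the gaps that still have to come from storage.

```python
from typing import Callable, Dict, List, Tuple

Range = Tuple[int, int]  # half-open [start, stop)


class OverlapCache:
    """Toy 1-D cache that reuses any cached range overlapping a query.

    Illustrative only: names and structure are assumptions, not the
    framework's real API.
    """

    def __init__(self, fetch: Callable[[int, int], List[int]]) -> None:
        self.fetch = fetch                       # hypothetical backing-storage reader
        self.entries: Dict[Range, List[int]] = {}
        self.storage_reads: List[Range] = []     # log of ranges read from storage

    def plan(self, start: int, stop: int) -> Tuple[List[Range], List[Range]]:
        """Split a query into cached hits and the gaps that must be fetched."""
        hits = sorted(
            (max(s, start), min(e, stop))
            for (s, e) in self.entries
            if s < stop and start < e            # ranges intersect
        )
        gaps, cursor = [], start
        for s, e in hits:
            if s > cursor:                       # uncovered stretch before this hit
                gaps.append((cursor, s))
            cursor = max(cursor, e)
        if cursor < stop:                        # uncovered tail
            gaps.append((cursor, stop))
        return hits, gaps

    def get(self, start: int, stop: int) -> List[int]:
        """Serve a query from overlapping cache entries plus fetched gaps."""
        _, gaps = self.plan(start, stop)
        out = [None] * (stop - start)
        # Copy the overlapping slice of every cached entry into the result.
        for (s, e), data in list(self.entries.items()):
            lo, hi = max(s, start), min(e, stop)
            if lo < hi:
                out[lo - start:hi - start] = data[lo - s:hi - s]
        # Fetch only the missing gaps, and cache them for later queries.
        for s, e in gaps:
            chunk = self.fetch(s, e)
            self.storage_reads.append((s, e))
            self.entries[(s, e)] = chunk
            out[s - start:e - start] = chunk
        return out


storage = list(range(100, 200))                  # stand-in for slow storage
cache = OverlapCache(lambda s, e: storage[s:e])
cache.get(0, 10)                                 # cold: reads [0, 10) from storage
result = cache.get(5, 15)                        # reuses [5, 10), reads only [10, 15)
```

A key-value cache keyed on the exact range would treat the second query as a full miss. In this sketch only the gap `(10, 15)` is read from storage. The sketch omits the multi-dimensional indexing, adaptive cache-versus-storage decisions, and entry merging that the abstract attributes to the real system.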

Award ID(s): 2412436; 2443219

PAR ID: 10612121

Author(s) / Creator(s): Guo, Chang; Podhorszki, Norbert; Eisenhauer, Greg; Xie, Zhiwen; Klasky, Scott; Cao, Zhichao

Publisher / Repository: ACM

Date Published: 2025-07-10

Page Range / eLocation ID: 129 to 136

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper: https://doi.org/10.1145/3736548.3737839
