Reducing Load Latency with Cache Level Prediction

Jalili, Majid; Erez, Mattan

doi:10.1109/HPCA53966.2022.00054

Citation Details

High load latency that results from deep cache hierarchies and relatively slow main memory is an important limiter of single-thread performance. Data prefetching helps reduce this latency by fetching data up the hierarchy before it is requested by load instructions. However, data prefetching has been shown to be imperfect in many situations. We propose cache-level prediction to complement prefetchers. Our method predicts which level of the memory hierarchy a load will access, allowing the memory access to start earlier and thereby saving many cycles. The predictor provides high prediction accuracy at the cost of just one cycle of added latency on L1 misses. Level prediction reduces memory access latency by 20% on average, and provides a speedup of 10.3% over a conventional baseline and 6.1% over a boosted baseline on generic, graph, and HPC applications.
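The latency argument in the abstract can be illustrated with a toy model. The sketch below is not the predictor from the paper: it assumes a simple last-outcome table indexed by load PC, made-up per-level latencies, and a fallback to the ordinary sequential lookup on a misprediction. The names `LATENCY`, `LastLevelPredictor`, `sequential_latency`, and `predicted_latency` are hypothetical and exist only for this illustration. Only the one-cycle predictor cost on L1 misses is taken from the abstract.

```python
# Toy model of cache-level prediction. All latencies are assumed values
# chosen for illustration; they are not taken from the paper.
LEVELS = ["L1", "L2", "L3", "MEM"]
LATENCY = {"L1": 4, "L2": 14, "L3": 40, "MEM": 200}  # cycles (assumed)
PREDICTOR_DELAY = 1  # one added cycle on L1 misses, as stated in the abstract


def sequential_latency(actual):
    """Cycles when every level is probed in order until the data is found."""
    last = LEVELS.index(actual)
    return sum(LATENCY[level] for level in LEVELS[: last + 1])


class LastLevelPredictor:
    """Hypothetical predictor: remembers where each load PC last found data."""

    def __init__(self):
        self.table = {}

    def predict(self, pc):
        # Unseen PCs default to L2, the first level below L1 (an assumption).
        return self.table.get(pc, "L2")

    def update(self, pc, actual):
        self.table[pc] = actual


def predicted_latency(predictor, pc, actual):
    """Cycles for one load when a level predictor is consulted on an L1 miss."""
    if actual == "L1":
        return LATENCY["L1"]  # L1 hits never consult the predictor
    guess = predictor.predict(pc)
    predictor.update(pc, actual)
    if guess == actual:
        # Correct prediction: go straight to the predicted level after L1.
        return LATENCY["L1"] + PREDICTOR_DELAY + LATENCY[actual]
    # Assumed recovery policy: fall back to the sequential lookup,
    # still paying the predictor's extra cycle.
    return PREDICTOR_DELAY + sequential_latency(actual)
```

Under these assumed latencies, a load that hits in L3 costs 58 cycles with sequential lookup and 45 cycles when the level is predicted correctly, while a misprediction costs 59 cycles. A real design would also have to handle coherence and the cost of wasted probes, which this sketch ignores.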

Award ID(s): 1719061

PAR ID: 10340193

Author(s) / Creator(s): Jalili, Majid; Erez, Mattan

Date Published: 2022-04-01

Journal Name: Proceedings of the 2022 IEEE International Symposium on High Performance Computer Architecture (HPCA)

Page Range / eLocation ID: 648 to 661

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: https://doi.org/10.1109/HPCA53966.2022.00054