NSF PAR Search | NSF Public Access Repository


Cache Programming for Scientific Loops Using Leases

Reber, Benjamin; Gould, Matthew; Kneipp, Alexander H.; Liu, Fangzhou; Prechtl, Ian; Ding, Chen; Chen, Linlin; Patru, Dorin (July 2023, ACM transactions on architecture and code optimization)

Cache management is important in exploiting locality and reducing data movement. This article studies a new type of programmable cache called the lease cache. By assigning leases, software exerts the primary control on when and how long data stays in the cache. Previous work has shown an optimal solution for an ideal lease cache. This article develops and evaluates a set of practical solutions for a physical lease cache emulated in FPGA with the full suite of PolyBench benchmarks. Compared to automatic caching, lease programming can further reduce data movement by 10% to over 60% when the data size is 16 times to 3,000 times the cache size, and the techniques in this article realize over 80% of this potential. Moreover, lease programming can reduce data movement by another 0.8% to 20% after polyhedral locality optimization.
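The lease mechanism in this abstract can be illustrated with a small simulator. This is a minimal sketch under stated assumptions, not the article's implementation: time is counted in accesses, each access renews the lease of the block it touches, a lease of 0 means the block is not kept, and the cache is treated as unbounded (the "ideal" case). The names `simulate_lease_cache`, `trace`, and `lease_of` are hypothetical.

```python
def simulate_lease_cache(trace, lease_of):
    """Count misses and peak occupancy for an idealized lease cache.

    trace    : list of (reference_id, address) pairs, one per access
    lease_of : dict mapping reference_id -> lease length in accesses
               (assumption: one fixed lease per program reference)
    """
    expiry = {}  # address -> last logical time at which the block is still cached
    misses = 0
    peak = 0
    for now, (ref, addr) in enumerate(trace):
        # Drop every block whose lease ended before this access.
        for stale in [a for a, t in expiry.items() if t < now]:
            del expiry[stale]
        if addr not in expiry:
            misses += 1
        lease = lease_of[ref]
        if lease > 0:
            expiry[addr] = now + lease  # the access renews the lease
        else:
            expiry.pop(addr, None)      # lease 0: do not keep the block
        peak = max(peak, len(expiry))
    return misses, peak


# Hypothetical trace: reference A reuses x every 2 accesses, B streams.
trace = [("A", "x"), ("B", "y"), ("A", "x"), ("B", "z"), ("A", "x")]
long_enough = simulate_lease_cache(trace, {"A": 2, "B": 0})
too_short = simulate_lease_cache(trace, {"A": 1, "B": 0})
```

In this toy trace, a lease of 2 on reference A covers its reuse of `x`, while a lease of 1 expires one access too early, so every access misses.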
Full Text Available
Blast from the Past: Least Expected Use (LEU) Cache Replacement with Statistical History

https://doi.org/10.1145/3591195.3595267

Chakraborti, Sayak; Zhang, Zhizhou; Bertram, Noah; Ding, Chen; Dwarkadas, Sandhya (June 2023, ACM SIGPLAN International Symposium on Memory Management)
Erez Petrank and Steve Blackburn (Ed.)
Cache replacement policies typically use some form of statistics on past access behavior. As a common limitation, however, the extent of the history being recorded is limited to either just the data in cache or, more recently, a larger but still finite-length window of accesses, because the cost of keeping a long history can easily outweigh its benefit. This paper presents a statistical method to keep track of instruction pointer-based access reuse intervals of arbitrary length and uses this information to identify the Least Expected Use (LEU) blocks for replacement. LEU uses dynamic sampling supported by novel hardware that maintains a state to record arbitrarily long reuse intervals. LEU is evaluated using the Cache Replacement Championship simulator, tested on PolyBench and SPEC, and compared with five policies including a recent technique that approximates optimal caching using a fixed-length history. By maintaining statistics for an arbitrary history, LEU outperforms previous techniques for a broad range of scientific kernels, whose data reuses are longer than those in traces traditionally used in computer architecture studies.
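One plausible reading of "least expected use" can be sketched as follows. This is an assumption made for illustration, not the paper's definition or its hardware design: each instruction pointer has a histogram of observed reuse intervals, and the victim is the block whose expected remaining time to its next use is largest, given how long it has already gone unused. The names `expected_time_to_reuse`, `choose_victim`, and the sample histograms are hypothetical.

```python
import math


def expected_time_to_reuse(histogram, age):
    """Expected remaining accesses until reuse, given `age` accesses without one.

    histogram : dict mapping reuse_interval -> observed count
    Only intervals longer than `age` are still possible; if none remain,
    the block is treated as never reused (infinite expectation).
    """
    remaining = {ri: c for ri, c in histogram.items() if ri > age}
    total = sum(remaining.values())
    if total == 0:
        return math.inf
    return sum(ri * c for ri, c in remaining.items()) / total - age


def choose_victim(blocks, histograms, now):
    """Pick the cached block with the largest expected time to its next use.

    blocks     : dict mapping address -> (last_access_time, instruction_pointer)
    histograms : dict mapping instruction_pointer -> reuse-interval histogram
    """
    def score(addr):
        last, ip = blocks[addr]
        return expected_time_to_reuse(histograms.get(ip, {}), now - last)

    return max(blocks, key=score)


# Hypothetical statistics: ip1 mostly reuses after 2 accesses, rarely after 100.
histograms = {"ip1": {2: 8, 100: 2}, "ip2": {40: 5}}
blocks = {"x": (7, "ip1"), "y": (5, "ip2")}
victim_early = choose_victim(blocks, histograms, now=8)
victim_late = choose_victim(blocks, histograms, now=10)
```

In this example the choice flips over time: while `x` may still see its short reuse it is kept, but once that window has passed, only the long interval remains possible and `x` becomes the victim.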
Full Text Available
Beyond time complexity: data movement complexity analysis for matrix multiplication

https://doi.org/10.1145/3524059.3532395

Smith, Wesley; Goldfarb, Aidan; Ding, Chen (June 2022, 2022 International Conference on Supercomputing (ICS ’22))

Full Text Available
Cache-coherent CLAM (WIP)

https://doi.org/10.1145/3519941.3535073

Ding, Chen; Reber, Benjamin; Patru, Dorin (June 2022, Cache-Coherent CLAM (WIP))

Full Text Available
CARL: Compiler Assigned Reference Leasing

https://doi.org/10.1145/3498730

Ding, Chen; Chen, Dong; Liu, Fangzhou; Reber, Benjamin; Smith, Wesley (March 2022, ACM Transactions on Architecture and Code Optimization)

Data movement is a common performance bottleneck, and its chief remedy is caching. Traditional cache management is transparent to the workload: data that should be kept in cache are determined by the recency information only, while the program information, i.e., future data reuses, is not communicated to the cache. This has changed in a new cache design named Lease Cache. The program control is passed to the lease cache by a compiler technique called Compiler Assigned Reference Lease (CARL). This technique collects the reuse interval distribution for each reference and uses it to compute and assign the lease value to each reference. In this article, we prove that CARL is optimal under certain statistical assumptions. Based on this optimality, we prove miss curve convexity, which is useful for optimizing shared cache, and sub-partitioning monotonicity, which simplifies lease compilation. We evaluate the potential using scientific kernels from PolyBench and show that compiler insertions of up to 34 leases in program code achieve similar or better cache utilization (in variable size cache) than the optimal fixed-size caching policy, which has been unattainable with automatic caching but is now within the potential of cache programming for all tested programs and most cache sizes.
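The first step described here, collecting a reuse interval distribution per reference, can be sketched as below. This covers only the profiling step under stated assumptions and does not reproduce CARL's lease assignment: intervals are measured in accesses and attributed to the reference that starts the interval, since that is where a lease would be attached. The names `reuse_interval_histograms`, `hits_for_lease`, and the sample trace are hypothetical.

```python
from collections import Counter, defaultdict


def reuse_interval_histograms(trace):
    """Build a reuse-interval histogram for each program reference.

    trace : list of (reference_id, address) pairs, one per access
    Returns a dict mapping reference_id -> Counter {reuse_interval: count}.
    """
    last_seen = {}  # address -> (time, reference) of its most recent access
    hist = defaultdict(Counter)
    for now, (ref, addr) in enumerate(trace):
        if addr in last_seen:
            prev_time, prev_ref = last_seen[addr]
            # Credit the interval to the reference that made the earlier access.
            hist[prev_ref][now - prev_time] += 1
        last_seen[addr] = (now, ref)
    return hist


def hits_for_lease(histogram, lease):
    """Number of profiled reuses that a given lease length would cover."""
    return sum(count for ri, count in histogram.items() if ri <= lease)


# Hypothetical trace: A reuses x every 2 accesses; B returns to y after 4.
trace = [("A", "x"), ("B", "y"), ("A", "x"), ("B", "z"), ("A", "x"), ("B", "y")]
hist = reuse_interval_histograms(trace)
```

Given such histograms, a lease assignment weighs the hits a lease would gain at each reference against the cache space it occupies; `hits_for_lease` shows only the benefit side of that trade-off.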
Full Text Available
Writeback Modeling: Theory and Application to Zipfian Workloads

https://doi.org/10.1145/3488423.3519331

Smith, Wesley; Byrne, Daniel; Ding, Chen (September 2021, The International Symposium on Memory Systems (MEMSYS 2021))

Full Text Available
Uniform lease vs. LRU cache: analysis and evaluation

https://doi.org/10.1145/3459898.3463908

Chen, Dong; Ding, Chen; Liu, Fangzhou; Reber, Benjamin; Smith, Wesley; Li, Pengcheng (June 2021, International Symposium on Memory Management (ISMM ’21))

Full Text Available
AWLCO: All-Window Length Co-Occurrence

https://doi.org/10.4230/LIPIcs.CPM.2021.24

Sobel, Joshua; Bertram, Noah; Ding, Chen; Nargesian, Fatemeh; Gildea, Daniel (January 2021, 32nd Annual Symposium on Combinatorial Pattern Matching (CPM 2021))

Full Text Available
CLAM: Compiler Lease of Cache Memory

https://doi.org/10.1145/3422575.3422800

Prechtl, Ian; Reber, Ben; Ding, Chen; Patru, Dorin; Chen, Dong (September 2020, International Symposium on Memory Systems)
Traditional caching is transparent to software but cannot utilize program information directly. With Moore’s Law ending and general-purpose processor speed plateauing, there is increasing importance and interest in specialization including the interaction between the software and the cache. This paper presents Compiler Lease of cAche Memory (CLAM) which augments the interface between software and hardware and lets a compiler control cache management. The new software control enables optimization beyond what is possible in traditional memory system designs. CLAM has been implemented on a CycloneV-GT FPGA card with a RISC-V processor and the new hardware cache, and the evaluation has shown performance improvements over existing techniques in all of the 7 programs tested from the PolyBench suite.
Full Text Available
Statistical caching for near memory management

https://doi.org/10.1145/3357526.3357557

Chen, Dong; Liu, Fangzhou; Jiao, Mingyang; Ding, Chen; Pai, Sreepathi (September 2019, Proceedings of the International Symposium on Memory Systems)

Modern GPUs often use near memory or high-bandwidth memory, which may be managed as cache when the application data is too large to fit in the near memory. Unlike CPU caches, the near memory cache has a much larger size. A recent approach is statistical caching, which shows near-optimal results when managing large memory for file caching. That prior work, however, is idealized and not practical. This paper outlines two extensions. It first formulates a new caching algorithm called least expected use (LEU) replacement and shows, through examples, that the statistical solution automatically integrates two otherwise disparate policies. Then the paper describes a system design to implement LEU. To position the new design for discussion, the paper draws parallels with two familiar ideas, branch prediction and spectral analysis, and considers a set of opportunities and challenges of achieving statistical caching in near memory.
Full Text Available
