NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Lightweight and Locality-Aware Composition of Black-Box Subroutines

https://doi.org/10.1145/3729292

Bansal, Manya; Sharlet, Dillon; Ragan-Kelley, Jonathan; Amarasinghe, Saman (June 2025, Proceedings of the ACM on Programming Languages)

Subroutines are essential building blocks in software design: users encapsulate common functionality in libraries and write applications by composing calls to subroutines. Unfortunately, performance may be lost at subroutine boundaries due to reduced locality and increased memory consumption. Operator fusion helps recover the performance lost at composition boundaries. Previous solutions fuse operators by manually rewriting code into monolithic fused subroutines, or by relying on heavy-weight compilers to generate code that performs fusion. Both approaches require a semantic understanding of the entire computation, breaking the decoupling necessary for modularity and reusability of subroutines. In this work, we attempt to identify the minimal ingredients required to fuse computations, enabling composition of subroutines without sacrificing performance or modularity. We find that, unlike previous approaches that require a semantic understanding of the computation, most opportunities for fusion require understanding only data production and consumption patterns.Exploiting this insight, we add fusion on top of black-box subroutines by proposing a lightweight enrichment of subroutine declarations to expose data-dependence patterns. We implement our approach in a system called Fern, and demonstrate Fern’s benefits by showing that it is competitive with state-of-the-art, high-performance libraries with manually fused operators, can fuse across library and domain boundaries for unforeseen workloads, and can deliver speedups of up to 5× over unfused code.
more » « less
Free, publicly-accessible full text available June 10, 2026
Finch: Sparse and Structured Tensor Programming with Control Flow

https://doi.org/10.1145/3720473

Ahrens, Willow; Collin, Teodoro Fields; Patel, Radha; Deeds, Kyle; Hong, Changwan; Amarasinghe, Saman (April 2025, Proceedings of the ACM on Programming Languages)

From FORTRAN to NumPy, tensors have revolutionized how we express computation. However, tensors in these, and almost all prominent systems, can only handle dense rectilinear integer grids. Real world tensors often contain underlying structure, such as sparsity, runs of repeated values, or symmetry. Support for structured data is fragmented and incomplete. Existing frameworks limit the tensor structures and program control flow they support to better simplify the problem. In this work, we propose a new programming language, Finch, which supports both flexible control flow and diverse data structures. Finch facilitates a programming model which resolves the challenges of computing over structured tensors by combining control flow and data structures into a common representation where they can be co-optimized. Finch automatically specializes control flow to data so that performance engineers can focus on experimenting with many algorithms. Finch supports a familiar programming language of loops, statements, ifs, breaks, etc., over a wide variety of tensor structures, such as sparsity, run-length-encoding, symmetry, triangles, padding, or blocks. Finch reliably utilizes the key properties of structure, such as structural zeros, repeated values, or clustered non-zeros. We show that this leads to dramatic speedups in operations such as SpMV and SpGEMM, image processing, and graph analytics.
more » « less
Free, publicly-accessible full text available April 9, 2026
Exo 2: Growing a Scheduling Language

https://doi.org/10.1145/3669940.3707218

Ikarashi, Yuka; Qian, Kevin; Droubi, Samir; Reinking, Alex; Bernstein, Gilbert Louis; Ragan-Kelley, Jonathan (March 2025, ACM)

Free, publicly-accessible full text available March 30, 2026
SySTeC: A Symmetric Sparse Tensor Compiler

https://doi.org/10.1145/3696443.3708919

Patel, Radha; Ahrens, Willow; Amarasinghe, Saman (March 2025, ACM)

Free, publicly-accessible full text available March 1, 2026

Search for: All records