NSF PAR Search | NSF Public Access Repository

Galley: Modern Query Optimization for Sparse Tensor Programs

https://doi.org/10.1145/3725301

Deeds, Kyle; Ahrens, Willow; Balazinska, Magdalena; Suciu, Dan (June 2025, Proceedings of the ACM on Management of Data)

The tensor programming abstraction is a foundational paradigm which allows users to write high-performance programs via a high-level imperative interface. Recent work on sparse tensor compilers has extended this paradigm to sparse tensors (i.e., tensors where most entries are not explicitly represented). With these systems, users define the semantics of the program and the algorithmic decisions in a concise language that can be compiled to efficient low-level code. However, these systems still require users to make complex decisions about program structure and memory layouts to write efficient programs. This work presents Galley, a system for declarative tensor programming that allows users to write efficient tensor programs without making complex algorithmic decisions. Galley is the first system to perform cost-based lowering of sparse tensor algebra to the imperative language of sparse tensor compilers, and the first to optimize arbitrary operators beyond Σ and *. First, it decomposes the input program into a sequence of aggregation steps through a novel extension of the FAQ framework. Second, Galley optimizes and converts each aggregation step to a concrete program, which is compiled and executed with a sparse tensor compiler. We show that Galley produces programs that are 1-300x faster than competing methods for machine learning over joins and 5-20x faster than a state-of-the-art relational database for subgraph counting workloads with a minimal optimization overhead.
Free, publicly-accessible full text available June 17, 2026
Finch: Sparse and Structured Tensor Programming with Control Flow

https://doi.org/10.1145/3720473

Ahrens, Willow; Collin, Teodoro Fields; Patel, Radha; Deeds, Kyle; Hong, Changwan; Amarasinghe, Saman (April 2025, Proceedings of the ACM on Programming Languages)

From FORTRAN to NumPy, tensors have revolutionized how we express computation. However, tensors in these, and almost all prominent systems, can only handle dense rectilinear integer grids. Real-world tensors often contain underlying structure, such as sparsity, runs of repeated values, or symmetry. Support for structured data is fragmented and incomplete. Existing frameworks limit the tensor structures and program control flow they support to better simplify the problem. In this work, we propose a new programming language, Finch, which supports both flexible control flow and diverse data structures. Finch facilitates a programming model which resolves the challenges of computing over structured tensors by combining control flow and data structures into a common representation where they can be co-optimized. Finch automatically specializes control flow to data so that performance engineers can focus on experimenting with many algorithms. Finch supports a familiar programming language of loops, statements, ifs, breaks, etc., over a wide variety of tensor structures, such as sparsity, run-length-encoding, symmetry, triangles, padding, or blocks. Finch reliably utilizes the key properties of structure, such as structural zeros, repeated values, or clustered non-zeros. We show that this leads to dramatic speedups in operations such as SpMV and SpGEMM, image processing, and graph analytics.
Free, publicly-accessible full text available April 9, 2026
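As a rough illustration of what "utilizing structural zeros and repeated values" can mean in practice, the sketch below shows two hand-written kernels in plain Python. This is not Finch syntax and not the authors' implementation; the function names `spmv_csr` and `rle_dot`, and the CSR and run-length layouts used here, are assumptions chosen for illustration only.

```python
def spmv_csr(indptr, indices, data, x):
    """Compute y = A @ x for A stored in compressed sparse row (CSR) form.

    Only explicitly stored entries are visited, so structural zeros cost nothing.
    """
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def rle_dot(runs, x):
    """Dot product of a run-length-encoded vector with a dense vector x.

    runs is a list of (value, length) pairs covering the vector left to right.
    Each run needs one multiplication, and zero runs are skipped entirely.
    """
    total = 0.0
    pos = 0
    for value, length in runs:
        if value != 0:
            total += value * sum(x[pos:pos + length])
        pos += length
    return total


# A = [[1, 0, 2],
#      [0, 0, 0],
#      [0, 3, 0]]
y = spmv_csr([0, 2, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
d = rle_dot([(0, 2), (5, 3)], [1, 2, 3, 4, 5])
```

Each kernel here is specialized by hand to one storage format; the abstract describes Finch as performing this kind of specialization automatically.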
Partition Constraints for Conjunctive Queries: Bounds and Worst-Case Optimal Joins

https://doi.org/10.4230/LIPICS.ICDT.2025.17

Deeds, Kyle; Merkl, Timo Camillo (January 2025, Schloss Dagstuhl – Leibniz-Zentrum für Informatik)
Roy, Sudeepa; Kara, Ahmet (Ed.)
In the last decade, various works have used statistics on relations to improve both the theory and practice of conjunctive query execution. Starting with the AGM bound, which took advantage of relation sizes, later works incorporated statistics like functional dependencies and degree constraints. Each new statistic prompted work along two lines: bounding the size of conjunctive query outputs and designing worst-case optimal join algorithms. In this work, we continue in this vein by introducing a new statistic called a partition constraint. This statistic captures latent structure within relations by partitioning them into sub-relations which each have much tighter degree constraints. We show that this approach can both refine existing cardinality bounds and improve existing worst-case optimal join algorithms.
Free, publicly-accessible full text available January 1, 2026
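For readers unfamiliar with the AGM bound mentioned in the abstract above, a minimal sketch of its best-known instance follows: for the triangle query Q(a,b,c) = R(a,b), S(b,c), T(a,c), the output size is at most sqrt(|R|·|S|·|T|). The helper names `triangle_count` and `agm_triangle_bound` are hypothetical, and the code does not implement partition constraints; it only checks the size-based bound on a small example.

```python
import math


def triangle_count(R, S, T):
    """Size of Q(a,b,c) = R(a,b), S(b,c), T(a,c), computed by direct enumeration."""
    S_by_b = {}
    for b, c in S:
        S_by_b.setdefault(b, []).append(c)
    T_set = set(T)
    return sum(1 for a, b in R for c in S_by_b.get(b, []) if (a, c) in T_set)


def agm_triangle_bound(R, S, T):
    """AGM bound for the triangle query, using relation sizes only."""
    return math.sqrt(len(R) * len(S) * len(T))


# Complete 3x3 relations: the bound is tight on this instance.
full = {(i, j) for i in range(3) for j in range(3)}
count = triangle_count(full, full, full)
bound = agm_triangle_bound(full, full, full)
```

On this instance both values equal 27, which shows that the bound cannot be improved using relation sizes alone; finer statistics such as degree or partition constraints are what allow tighter bounds on less uniform data.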
Pessimistic Cardinality Estimation

https://doi.org/10.1145/3712311.3712313

Abo_Khamis, Mahmoud; Deeds, Kyle; Olteanu, Dan; Suciu, Dan (January 2025, ACM SIGMOD Record)

Cardinality estimation is the problem of estimating the size of the output of a query without computing it, using only statistics on the input relations. Existing estimators try to return an unbiased estimate of the cardinality, which is notoriously difficult. A new class of estimators, called pessimistic estimators, has been proposed recently; these compute a guaranteed upper bound on the query output. Two recent advances have made pessimistic estimators practical. The first is the recent observation that degree sequences of the input relations can be used to compute query upper bounds. The second is a long line of theoretical results that have developed the use of information theoretic inequalities for query upper bounds. This paper is a short overview of pessimistic cardinality estimators, contrasting them with traditional estimators.
Free, publicly-accessible full text available January 13, 2026
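To make the idea of a degree-sequence upper bound concrete, here is a minimal sketch for the simplest case, a single two-way equijoin. It pairs the sorted degree sequences of the join column in each relation; by the rearrangement inequality, the resulting sum can never be smaller than the true join size. The function names are hypothetical, and this covers only the single-join special case, not the general estimators surveyed in the paper.

```python
from collections import Counter


def degree_sequence(relation, col):
    """Degrees of the values in column col, sorted from largest to smallest."""
    counts = Counter(t[col] for t in relation)
    return sorted(counts.values(), reverse=True)


def degree_sequence_bound(R, S, r_col, s_col):
    """Upper bound on |R join S|: pair the i-th largest degrees on each side."""
    dr = degree_sequence(R, r_col)
    ds = degree_sequence(S, s_col)
    return sum(a * b for a, b in zip(dr, ds))


def true_join_size(R, S, r_col, s_col):
    """Exact join size, for comparison only (an estimator would not compute this)."""
    cr = Counter(t[r_col] for t in R)
    cs = Counter(t[s_col] for t in S)
    return sum(n * cs[v] for v, n in cr.items())


R = [(1, "a"), (2, "a"), (3, "b")]
S = [("b", 10), ("b", 11), ("b", 12), ("a", 13)]
bound = degree_sequence_bound(R, S, 1, 0)   # 2*3 + 1*1 = 7
exact = true_join_size(R, S, 1, 0)          # 2*1 + 1*3 = 5
```

The bound uses only the degree sequences, not which value has which degree, so it stays valid for any pair of relations that share those statistics.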
Degree Sequence Bound for Join Cardinality Estimation

Deeds, Kyle Suciu (April 2023, 26th International Conference on Database Theory (ICDT))

Full Text Available
Degree Sequence Bound for Join Cardinality Estimation

Deeds, Kyle; Suciu, Dan; Balazinska, Magda; Cai, Walter (March 2023, 26th International Conference on Database Theory (ICDT 2023))
