Search results: All records where Creators/Authors contains "Ragan-Kelley, Jonathan"


  1. Producing efficient array code is crucial in high-performance domains like image processing and machine learning. It requires the ability to control factors like compute intensity and locality by reordering computations into different stages and granularities with respect to where they are stored. However, traditional pure, functional tensor languages struggle to do so. In a previous publication, we introduced ATL as a pure, functional tensor language capable of systematically decoupling compute and storage order via a set of high-level combinators known as reshape operators. Reshape operators are a unique functional-programming construct: they manipulate storage location in the generated code by modifying the indices that appear on the left-hand sides of storage expressions. We present a formal correctness proof for an implementation of the compilation algorithm, marking the first verification of a lowering algorithm that targets imperative loop nests from a source functional language offering separate control of compute and storage ordering. One of the core difficulties of this proof was properly formulating the complex invariants that ensure these storage-index remappings are well-formed. Notably, this exercise revealed a soundness bug in the originally published compilation algorithm involving the truncation reshape operators. Our fix is a new type system that captures safety conditions that were previously implicit and enables us to prove compiler correctness for well-typed source programs. We evaluate this type system and compiler implementation on a range of common programs and optimizations, including but not limited to those previously studied, demonstrating performance comparable to established compilers such as Halide.
    Free, publicly-accessible full text available June 20, 2025
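
To make the compute/storage tradeoff concrete, here is a minimal sketch in plain Python/NumPy (not ATL syntax; the function names are hypothetical): the same two-stage blur can either materialize its intermediate stage or fuse it into the consumer, which is exactly the kind of choice ATL's reshape operators let a program control.

```python
import numpy as np

def blur_staged(x):
    # Stage 1 is fully materialized: extra storage, each value computed once.
    tmp = np.empty(len(x) - 1)
    for i in range(len(x) - 1):
        tmp[i] = (x[i] + x[i + 1]) / 2
    out = np.empty(len(tmp) - 1)
    for i in range(len(tmp) - 1):
        out[i] = (tmp[i] + tmp[i + 1]) / 2
    return out

def blur_fused(x):
    # Stage 1 is inlined into stage 2: no intermediate array, better locality,
    # but each interior stage-1 value is recomputed twice.
    out = np.empty(len(x) - 2)
    for i in range(len(x) - 2):
        t0 = (x[i] + x[i + 1]) / 2
        t1 = (x[i + 1] + x[i + 2]) / 2
        out[i] = (t0 + t1) / 2
    return out
```

Both functions compute the same result; they differ only in storage order and compute intensity, the dimension along which the verified compiler must preserve meaning.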
  2. Computations in physical simulation, computer graphics, and probabilistic inference often require the differentiation of discontinuous processes due to contact, occlusion, and changes at a point in time. Popular differentiable programming languages, such as PyTorch and JAX, ignore discontinuities during differentiation. This is incorrect for parametric discontinuities: conditionals containing at least one real-valued parameter and at least one variable of integration. We introduce Potto, the first differentiable first-order programming language to soundly differentiate parametric discontinuities. We present a denotational semantics for programs and program derivatives and show that the two accord. We describe the implementation of Potto, which enables separate compilation of programs. Our prototype implementation overcomes previous compile-time bottlenecks, achieving 88.1x and 441.2x speedups in compile time and 2.5x and 7.9x speedups in runtime, respectively, on two increasingly large image stylization benchmarks. We showcase Potto by implementing a prototype differentiable renderer with separately compiled shaders.

     
    Free, publicly-accessible full text available April 29, 2025
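
A minimal NumPy illustration (not Potto code) of why pointwise differentiation fails on a parametric discontinuity: for f(θ) = ∫₀¹ [x < θ] dx = θ, the true derivative is 1, yet the integrand's derivative with respect to θ is 0 at almost every fixed x.

```python
import numpy as np

# f(theta) = integral over [0,1] of [x < theta] dx = theta, so df/dtheta = 1.
x = np.random.default_rng(0).uniform(0.0, 1.0, 100_000)

def f(theta):
    return np.mean(x < theta)  # Monte Carlo estimate of the integral

theta, eps = 0.5, 1e-2
print((f(theta + eps) - f(theta - eps)) / (2 * eps))  # ~1.0, the true derivative

# Differentiating the integrand pointwise (what PyTorch/JAX effectively do)
# gives d/dtheta [x < theta] = 0 for almost every fixed x, so autodiff reports
# df/dtheta = 0, missing the moving-boundary term that Potto handles soundly.
```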
  3. A single panel of a comic book can say a lot: it can depict not only where the characters currently are, but also their motions, their motivations, their emotions, and what they might do next. More generally, humans routinely infer complex sequences of past and future events from a static snapshot of a dynamic scene, even in situations they have never seen before. In this paper, we model how humans make such rapid and flexible inferences. Building on a long line of work in cognitive science, we offer a Monte Carlo algorithm whose inferences correlate well with human intuitions across a wide variety of domains, while using only a small, cognitively plausible number of samples. Our key technical insight is a surprising connection between our inference problem and Monte Carlo path tracing, which allows us to apply decades of ideas from the computer graphics community to this seemingly unrelated theory-of-mind task.
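
As a toy sketch of this style of sampling-based inference (a hypothetical model, not the paper's implementation): sample candidate action histories from a prior, weight each by how well it explains the observed snapshot, and read posterior judgments off a handful of weighted samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D world: an agent moved for T steps; we observe only its final position.
T, OBS_POS, NOISE = 5, 3.0, 0.5

def sample_history():
    steps = rng.choice([-1, 0, 1], size=T)        # prior over action sequences
    return steps, float(np.sum(steps))

def likelihood(final_pos):
    # How well does this hypothetical history explain the snapshot?
    return np.exp(-0.5 * ((final_pos - OBS_POS) / NOISE) ** 2)

# A small, cognitively plausible number of samples, as in the paper.
samples = [sample_history() for _ in range(25)]
weights = np.array([likelihood(pos) for _, pos in samples])
weights /= weights.sum()

# Posterior expectation of, say, the agent's first action:
first_actions = np.array([steps[0] for steps, _ in samples])
print(np.sum(weights * first_actions))
```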
  4. How much can you say about the gradient of a neural network without computing a loss or knowing the label? This may sound like a strange question: surely the answer is “very little.” However, in this paper, we show that gradients are more structured than previously thought. Gradients lie in a predictable low-dimensional subspace which depends on the network architecture and incoming features. Exploiting this structure can significantly improve gradient-free optimization schemes based on directional derivatives, which have struggled to scale beyond small networks trained on toy datasets. We study how to narrow the gap in optimization performance between methods that calculate exact gradients and those that use directional derivatives, and we highlight the new challenges that remain in closing it.
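
The directional-derivative schemes in question can be sketched as a generic "forward gradient" construction (plain NumPy, not the paper's full method): probe a random direction v, measure the directional derivative g·v, and step along (g·v)v, an unbiased but noisy gradient estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective with a known gradient: f(w) = 0.5 * ||A w - b||^2.
A = rng.normal(size=(20, 10))
b = A @ rng.normal(size=10)               # consistent system, optimum loss 0
loss = lambda w: 0.5 * np.linalg.norm(A @ w - b) ** 2
grad = lambda w: A.T @ (A @ w - b)

w, lr = np.zeros(10), 1e-3
print("before:", loss(w))
for _ in range(2000):
    v = rng.normal(size=w.shape)          # random probe direction
    dd = grad(w) @ v                      # directional derivative g.v
                                          # (one JVP; no backprop needed)
    w -= lr * dd * v                      # E[(g.v) v] = g, an unbiased guess
print("after:", loss(w))

# The paper's observation: true gradients lie in a low-dimensional subspace
# determined by the architecture and incoming features, so sampling v from
# that subspace rather than isotropically makes (g.v)v a far better guess.
```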
  5. We introduce SLANG.D, an extension to the Slang shading language that incorporates first-class automatic differentiation support. The new shading language allows us to transform a Direct3D-based path tracer to be fully differentiable with minor modifications to existing code. SLANG.D enables a shared ecosystem between machine learning frameworks and pre-existing graphics hardware API-based rendering systems, promoting the interchange of components and ideas across these two domains.

    Our contributions include a differentiable type system designed to ensure type safety and semantic clarity in codebases that blend differentiable and non-differentiable code, language primitives that automatically generate both forward and reverse gradient propagation methods, and a compiler architecture that generates efficient derivative propagation shader code for graphics pipelines. Our compiler supports differentiating code that involves arbitrary control flow, dynamic dispatch, generics, and higher-order differentiation, while giving developers flexible control over checkpointing and gradient aggregation strategies for best performance. Our system allows us to differentiate an existing real-time path tracer, Falcor, with minimal changes to its shader code. We show that the compiler-generated derivative kernels perform as efficiently as handwritten ones. In several benchmarks, the SLANG.D code achieves significant speedups over prior automatic differentiation systems.

     
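
What "automatically generating forward gradient propagation" means can be sketched language-agnostically; below is a tiny dual-number version in Python (SLANG.D performs the analogous transformation, along with reverse mode, at the shader level):

```python
# Forward mode: dual numbers carry (value, derivative) through the program.
class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)
    def __mul__(self, other):
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

def shade(x, y):
    # Stand-in for a differentiable shader function.
    return x * x + x * y

# Derivative of shade with respect to x at (x, y) = (2, 3):
print(shade(Dual(2.0, 1.0), Dual(3.0, 0.0)).dot)  # 2*x + y = 7.0
```

Reverse mode instead records operations on a tape and replays them backwards to propagate gradients, which is why checkpointing and gradient aggregation strategies matter for performance.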
  6. Great storytellers know how to take us on a journey. They direct characters to act—not necessarily in the most rational way—but rather in a way that leads to interesting situations, and ultimately creates an impactful experience for audience members looking on. If audience experience is what matters most, then can we help artists and animators directly craft such experiences, independent of the concrete character actions needed to evoke those experiences? In this paper, we offer a novel computational framework for such tools. Our key idea is to optimize animations with respect to simulated audience members’ experiences. To simulate the audience, we borrow an established principle from cognitive science: that human social intuition can be modeled as “inverse planning,” the task of inferring an agent’s (hidden) goals from its (observed) actions. Building on this model, we treat storytelling as “inverse inverse planning,” the task of choosing actions to manipulate an inverse planner’s inferences. Our framework is grounded in literary theory, naturally capturing many storytelling elements from first principles. We give a series of examples to demonstrate this, with supporting evidence from human subject studies. 
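
A toy sketch of the nested structure (a hypothetical model, not the paper's animation system): the inner inverse planner infers a goal from actions by Bayes' rule, and the outer storyteller searches over action sequences for those that best shape the audience's inference, here maximizing suspense measured as belief entropy.

```python
from itertools import product
import numpy as np

# Toy world: an agent at position 0 takes T unit steps toward one of two goals.
GOALS = np.array([-3.0, 3.0])
T = 4

def inverse_planner(actions):
    # Audience model: Bayesian inference of the goal from observed actions,
    # assuming the agent is softly rational (likelier to make progress).
    logp, pos = np.zeros(len(GOALS)), 0.0
    for a in actions:
        progress = np.abs(GOALS - pos) - np.abs(GOALS - (pos + a))
        logp += 2.0 * progress
        pos += a
    p = np.exp(logp - logp.max())
    return p / p.sum()

def suspense(actions):
    # One storytelling objective: keep the audience maximally uncertain.
    p = inverse_planner(actions)
    return -np.sum(p * np.log(p))

# Inverse inverse planning: choose actions for their effect on the audience.
best = max(product([-1.0, 1.0], repeat=T), key=suspense)
print(best, suspense(best))
```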
  7. Machine knitting is a well-established fabrication technique for complex soft objects, and both companies and researchers have developed tools for generating machine knitting patterns. However, existing representations for machine-knitted objects are either incomplete (they do not cover the complete domain of machine-knittable objects) or overly specific (they do not account for symmetries and equivalences among knitting instruction sequences). This makes it difficult to define correctness in machine knitting, let alone verify the correctness of a given program or program transformation. The major contribution of this work is a formal semantics for knitout, a low-level domain-specific language for knitting machines. We accomplish this by using what we call the "fenced tangle," which extends concepts from knot theory to allow for a mathematical definition of knitting program equivalence that matches the intuition behind knit objects. Finally, using this formal representation, we prove the correctness of a sequence of rewrite rules and demonstrate how these rewrite rules can form the foundation for higher-level tasks such as compiling a program for a specific machine and optimizing for time and reliability, all while provably generating the same knit object under our proposed semantics. By establishing formal definitions of correctness, this work provides a strong foundation for compiling and optimizing knit programs.
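
One flavor of rewrite rule can be sketched as follows (hypothetical Python, deliberately ignoring yarn carriers and the other side conditions that the fenced-tangle semantics exists to get right): two adjacent instructions that touch disjoint needles commute, so programs can be canonicalized and compared syntactically.

```python
def needles(instr):
    # e.g. "knit + f1 A" -> {"f1"};  "xfer f1 b1" -> {"f1", "b1"}
    return {tok for tok in instr.split() if tok[0] in "fb" and tok[1:].isdigit()}

def commutable(i1, i2):
    # Toy rule: instructions on disjoint needle sets can be reordered.
    # (Real knitout also needs carrier and timing side conditions.)
    return needles(i1).isdisjoint(needles(i2))

def normalize(program):
    # Bubble-sort-style canonicalization using only the commutation rule,
    # so programs equal up to commutation compare equal syntactically.
    prog, changed = list(program), True
    while changed:
        changed = False
        for i in range(len(prog) - 1):
            if commutable(prog[i], prog[i + 1]) and prog[i] > prog[i + 1]:
                prog[i], prog[i + 1] = prog[i + 1], prog[i]
                changed = True
    return prog

p1 = ["knit + f1 A", "knit + f3 A", "xfer f1 b1"]
p2 = ["knit + f3 A", "knit + f1 A", "xfer f1 b1"]
print(normalize(p1) == normalize(p2))  # True: equivalent up to commutation
```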
  8. Modern vector processors support a wide variety of instructions for fixed-point digital signal processing. These instructions support a proliferation of rounding, saturating, and type conversion modes, and are often fused combinations of more primitive operations. While these are common idioms in fixed-point signal processing, it is difficult to use these operations in portable code. It is challenging for programmers to write down portable integer arithmetic in a C-like language that corresponds exactly to one of these instructions, and even more challenging for compilers to recognize when these instructions can be used. Our system, Pitchfork, defines a portable fixed-point intermediate representation, FPIR, that captures common idioms in fixed-point code. FPIR can be used directly by programmers experienced with fixed-point arithmetic, or Pitchfork can automatically lift integer operations into FPIR using a term-rewriting system (TRS) composed of verified manual and automatically synthesized rules. Pitchfork then lowers from FPIR into target-specific fixed-point instructions using a set of target-specific TRSs. We show that this approach improves the runtime performance of portably written fixed-point signal processing code in Halide, across a range of benchmarks, by a geomean of 1.31× on x86 with AVX2, 1.82× on ARM Neon, and 2.44× on Hexagon HVX, compared to a standard LLVM-based compiler flow, while maintaining or improving existing compile times.
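
A portable sketch of one idiom FPIR captures (plain NumPy; the helper name is hypothetical): a Q15 multiply with round-to-nearest and saturation, which many DSPs implement as a single fused instruction but which is easy to state incorrectly in portable C.

```python
import numpy as np

def q15_mul(a: np.int16, b: np.int16) -> np.int16:
    prod = np.int32(a) * np.int32(b)           # widen to avoid overflow
    rounded = (prod + (1 << 14)) >> 15         # round to nearest at Q15 scale
    return np.int16(np.clip(rounded, -32768, 32767))  # saturate to int16

# 0.5 * 0.5 in Q15: 16384 * 16384 -> 8192 (= 0.25).
print(q15_mul(np.int16(16384), np.int16(16384)))    # 8192
# Saturation case: -1.0 * -1.0 would be +1.0, which Q15 cannot represent.
print(q15_mul(np.int16(-32768), np.int16(-32768)))  # 32767, clamped
```

Recognizing that a chain of widen, multiply, add, shift, and clamp operations means exactly this idiom is what the lifting and lowering TRSs automate.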
  9. Working with any gradient-based machine learning algorithm involves the tedious task of tuning the optimizer's hyperparameters, such as its step size. Recent work has shown how the step size can itself be optimized alongside the model parameters by manually deriving expressions for "hypergradients" ahead of time. We show how to automatically compute hypergradients with a simple and elegant modification to backpropagation. This allows us to easily apply the method to other optimizers and hyperparameters (e.g., momentum coefficients). We can even recursively apply the method to its own hyper-hyperparameters, and so on ad infinitum. As these towers of optimizers grow taller, they become less sensitive to the initial choice of hyperparameters. We present experiments validating this for MLPs, CNNs, and RNNs. Finally, we provide a simple PyTorch implementation of this algorithm (see http://people.csail.mit.edu/kach/gradient-descent-the-ultimate-optimizer).
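
The one-level hypergradient can be derived by hand for plain SGD (a toy NumPy sketch; the paper's contribution is obtaining this, and entire towers of it, automatically from backpropagation): since w_t = w_{t-1} - α·g_{t-1}, the chain rule gives dL/dα = ∇L(w_t)·(-g_{t-1}).

```python
import numpy as np

# Hypergradient descent on the step size for f(w) = 0.5 * ||w||^2.
grad = lambda w: w                    # gradient of the toy objective
w = np.array([5.0, -3.0])
alpha, kappa = 0.01, 1e-4             # step size and hyper-step size
prev_g = np.zeros_like(w)

for _ in range(500):
    g = grad(w)
    alpha += kappa * (g @ prev_g)     # descend on dL/dalpha = -g_t . g_{t-1}
    w -= alpha * g                    # ordinary SGD step with the tuned alpha
    prev_g = g

print(w, alpha)                       # w shrinks toward 0; alpha adapts upward
```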