NSF PAR Search | NSF Public Access Repository

Designing Perceptual Puzzles by Differentiating Probabilistic Programs

https://doi.org/10.1145/3528233.3530715

Chandra, Kartik; Li, Tzu-Mao; Tenenbaum, Joshua; Ragan-Kelley, Jonathan (August 2022, ACM SIGGRAPH Conference Proceedings)

Full Text Available
Exocompilation for productive programming of hardware accelerators

https://doi.org/10.1145/3519939.3523446

Ikarashi, Yuka; Bernstein, Gilbert Louis; Reinking, Alex; Genc, Hasan; Ragan-Kelley, Jonathan (June 2022, Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation)

High-performance kernel libraries are critical to exploiting accelerators and specialized instructions in many applications. Because compilers are difficult to extend to support diverse and rapidly evolving hardware targets, and automatic optimization is often insufficient to guarantee state-of-the-art performance, these libraries are commonly still coded and optimized by hand, at great expense, in low-level C and assembly. To better support development of high-performance libraries for specialized hardware, we propose a new programming language, Exo, based on the principle of exocompilation: externalizing target-specific code generation support and optimization policies to user-level code. Exo allows custom hardware instructions, specialized memories, and accelerator configuration state to be defined in user libraries. It builds on the idea of user scheduling to externalize hardware mapping and optimization decisions. Schedules are defined as composable rewrites within the language, and we develop a set of effect analyses which guarantee program equivalence and memory safety through these transformations. We show that Exo enables rapid development of state-of-the-art matrix-matrix multiply and convolutional neural network kernels, for both an embedded neural accelerator and x86 with AVX-512 extensions, in a few dozen lines of code each.
Full Text Available
Verified tensor-program optimization via high-level scheduling rewrites

https://doi.org/10.1145/3498717

Liu, Amanda; Bernstein, Gilbert Louis; Chlipala, Adam; Ragan-Kelley, Jonathan (January 2022, Proceedings of the ACM on Programming Languages)

We present a lightweight Coq framework for optimizing tensor kernels written in a pure, functional array language. Optimizations rely on user scheduling using a series of verified, semantics-preserving rewrites. Unusually for compilation targeting imperative code with arrays and nested loops, all rewrites are source-to-source within a purely functional language. Our language comprises a set of core constructs for expressing high-level computation detail and a set of what we call reshape operators, which can be derived from core constructs but trigger low-level decisions about storage patterns and ordering. We demonstrate that not only is this system capable of deriving the optimizations of existing state-of-the-art languages like Halide and generating comparably performant code, it is also able to schedule a family of useful program transformations beyond what is reachable in Halide.
Efficient Automatic Scheduling of Imaging and Vision Pipelines for the GPU

https://doi.org/10.1145/3485486

Anderson, Luke; Adams, Andrew; Ma, Karima; Li, Tzu-Mao; Jin, Tian; Ragan-Kelley, Jonathan (October 2021, Proceedings of the ACM on Programming Languages)

Full Text Available
Systematically Differentiating Parametric Discontinuities

https://doi.org/10.1145/3450626.3459775

Bangaru, Sai Praveen; Michel, Jesse; Mu, Kevin; Bernstein, Gilbert; Li, Tzu-Mao; Ragan-Kelley, Jonathan (August 2021, ACM Transactions on Graphics)

Emerging research in computer graphics, inverse problems, and machine learning requires us to differentiate and optimize parametric discontinuities. These discontinuities appear in object boundaries, occlusion, contact, and sudden change over time. In many domains, such as rendering and physics simulation, we differentiate the parameters of models that are expressed as integrals over discontinuous functions. Ignoring the discontinuities during differentiation often has a significant impact on the optimization process. Previous approaches either apply specialized hand-derived solutions, smooth out the discontinuities, or rely on incorrect automatic differentiation. We propose a systematic approach to differentiating integrals with discontinuous integrands, by developing a new differentiable programming language. We introduce integration as a language primitive and account for the Dirac delta contribution from differentiating parametric discontinuities in the integrand. We formally define the language semantics and prove correctness and closure under differentiation, allowing the generation of gradients and higher-order derivatives. We also build a system, Teg, implementing these semantics. Our approach is widely applicable to a variety of tasks, including image stylization, fitting shader parameters, trajectory optimization, and optimizing physical designs.
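The Dirac delta contribution mentioned in this abstract can be illustrated with a small numerical sketch. It does not use Teg or its API. It assumes a hypothetical one-dimensional example, I(t) = ∫₀¹ f(x)·[x < t] dx, whose correct derivative for 0 < t < 1 is f(t). The function names and the choice of f are illustrative only.

```python
import math

def f(x):
    # Arbitrary smooth integrand factor, chosen only for illustration.
    return math.cos(x) + 2.0

def integral(t, n=100_000):
    # Midpoint rule for I(t) = integral over [0, 1] of f(x) * [x < t].
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        if x < t:
            total += f(x)
    return total * h

def pointwise_derivative(t, n=100_000):
    # Differentiating the integrand sample by sample: for fixed x != t the
    # indicator [x < t] is locally constant in t, so every sample contributes 0.
    return sum(0.0 for _ in range(n)) / n

def delta_derivative(t):
    # d/dt [x < t] is a Dirac delta at x = t; integrating f against it
    # picks out f(t) when the discontinuity lies inside the domain.
    return f(t) if 0.0 < t < 1.0 else 0.0

def finite_difference(t, eps=1e-3):
    # Reference value obtained by differencing the integral itself.
    return (integral(t + eps) - integral(t - eps)) / (2.0 * eps)

if __name__ == "__main__":
    t = 0.5
    print("pointwise:", pointwise_derivative(t))
    print("delta term:", delta_derivative(t))
    print("finite difference:", finite_difference(t))
```

The pointwise derivative is zero, while the delta term agrees with the finite-difference reference. This is the kind of gap the abstract attributes to ignoring discontinuities.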
Full Text Available
DiffTaichi: Differentiable Programming for Physical Simulation

Hu, Yuanming; Anderson, Luke; Li, Tzu-Mao; Sun, Qi; Carr, Nathan; Ragan-Kelley, Jonathan; Durand, Frédo (January 2020, International Conference on Learning Representations)

We present DiffTaichi, a new differentiable programming language tailored for building high-performance differentiable physical simulators. Based on an imperative programming language, DiffTaichi generates gradients of simulation steps using source code transformations that preserve arithmetic intensity and parallelism. A lightweight tape is used to record the whole simulation program structure and replay the gradient kernels in reverse order, for end-to-end backpropagation. We demonstrate the performance and productivity of our language in gradient-based learning and optimization tasks on 10 different physical simulators. For example, a differentiable elastic object simulator written in our language is 4.2x shorter than the hand-engineered CUDA version yet runs as fast, and is 188x faster than the TensorFlow implementation. Using our differentiable programs, neural network controllers are typically optimized within only tens of iterations.
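The tape idea described in this abstract can be sketched in plain Python. This is not DiffTaichi code: the `Tape` class, the spring simulator, and the constants are hypothetical, and the gradient kernel is written by hand here, whereas the abstract says DiffTaichi generates it by source code transformation.

```python
DT, K, STEPS = 0.01, 4.0, 100  # illustrative time step, spring constant, step count

class Tape:
    """Records each launched kernel's gradient kernel, then replays in reverse."""
    def __init__(self):
        self.entries = []

    def launch(self, kernel, grad_kernel, *args):
        kernel(*args)                            # run the forward kernel now
        self.entries.append((grad_kernel, args)) # remember how to differentiate it

    def backward(self):
        for grad_kernel, args in reversed(self.entries):
            grad_kernel(*args)

def loss_and_grad(x0, v0):
    # State is kept for every time step so gradient kernels can read it later.
    x = [0.0] * (STEPS + 1)
    v = [0.0] * (STEPS + 1)
    gx = [0.0] * (STEPS + 1)  # d(loss)/d(x[t])
    gv = [0.0] * (STEPS + 1)  # d(loss)/d(v[t])
    x[0], v[0] = x0, v0

    def step(t):
        # Explicit Euler step of a unit-mass spring.
        x[t + 1] = x[t] + DT * v[t]
        v[t + 1] = v[t] - DT * K * x[t]

    def step_grad(t):
        # Hand-written adjoint of step(t): accumulate into the inputs' gradients.
        gx[t] += gx[t + 1] - DT * K * gv[t + 1]
        gv[t] += DT * gx[t + 1] + gv[t + 1]

    tape = Tape()
    for t in range(STEPS):
        tape.launch(step, step_grad, t)

    loss = x[STEPS] ** 2
    gx[STEPS] = 2.0 * x[STEPS]  # seed the gradient of the loss
    tape.backward()             # replay gradient kernels in reverse order
    return loss, gx[0], gv[0]

if __name__ == "__main__":
    print(loss_and_grad(1.0, 0.0))
```

Calling `loss_and_grad(1.0, 0.0)` returns the final loss together with its gradients with respect to the initial position and velocity, which can be checked against finite differences.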
Full Text Available