Title: Minotaur: A SIMD-Oriented Synthesizing Superoptimizer
A superoptimizing compiler (one that performs a meaningful search of the program space as part of the optimization process) can find optimization opportunities that are missed by even the best existing optimizing compilers. We created Minotaur: a superoptimizer for LLVM that uses program synthesis to improve its code generation, focusing on integer and floating-point SIMD code. On an Intel Cascade Lake processor, Minotaur achieves an average speedup of 7.3% on the GNU Multiple Precision (GMP) library's benchmark suite, with a maximum speedup of 13%. On SPEC CPU 2017, our superoptimizer produces an average speedup of 1.5%, with a maximum speedup of 4.5% for 638.imagick. Every optimization produced by Minotaur has been formally verified, and several optimizations that it has discovered have been implemented in LLVM as a result of our work.
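To make the search concrete, here is a minimal illustration of the kind of rewrite a SIMD-oriented superoptimizer hunts for; the example is ours, not an optimization reported in the paper. The scalar loop and the single-intrinsic candidate compute the same result, and a verifying superoptimizer such as Minotaur must prove that equivalence before accepting the rewrite.

```cpp
// Sketch (our example, not from the paper): a scalar saturating add and
// the SSE2 candidate a superoptimizer's search might propose for it.
#include <emmintrin.h>
#include <cstdint>

// Reference semantics: per-lane unsigned saturating add over 16 bytes.
void sat_add_scalar(uint8_t *dst, const uint8_t *a, const uint8_t *b) {
    for (int i = 0; i < 16; i++) {
        unsigned s = unsigned(a[i]) + unsigned(b[i]);
        dst[i] = s > 255 ? 255 : uint8_t(s);
    }
}

// Synthesized candidate: one SSE2 saturating-add instruction (PADDUSB).
void sat_add_simd(uint8_t *dst, const uint8_t *a, const uint8_t *b) {
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i *>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i *>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i *>(dst),
                     _mm_adds_epu8(va, vb));
}
```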
Barua, Prithayan; Zhao, Jisheng; Sarkar, Vivek
(European Conference on Parallel Processing (Euro-Par 2020))
The fast development of acceleration architectures and applications has made heterogeneous computing the norm for high-performance computing. The cost of high-volume data movement to the accelerators is an important bottleneck, both in terms of application performance and developer productivity. Memory management is still a manual task performed tediously by expert programmers. In this paper, we develop a compiler analysis to automate memory management for heterogeneous computing. We propose an optimization framework that casts the problem of detection and removal of redundant data movements into a partial redundancy elimination (PRE) problem and applies the lazy code motion technique to optimize these data movements. We chose OpenMP as the underlying parallel programming model and implemented our optimization framework in the LLVM toolchain. We evaluated it with ten benchmarks, obtaining a geometric-mean speedup of 2.3× and reducing the total bytes transferred between the host and GPU by 50% on average.
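A minimal sketch of the effect this optimization has on OpenMP offloading code (our illustration, not the paper's implementation): treating transfers as partially redundant lets the compiler hoist them out of the loop, so the array crosses the host-GPU boundary once instead of on every iteration.

```cpp
#include <omp.h>

// Before: 'a' is copied to the device and back on every iteration.
void step_per_iteration(float *a, int n, int iters) {
    for (int t = 0; t < iters; t++) {
        #pragma omp target map(tofrom: a[0:n])
        for (int i = 0; i < n; i++)
            a[i] += 1.0f;
    }
}

// After hoisting (the placement lazy code motion would choose): one
// transfer in and one transfer out for the whole loop nest.
void step_hoisted(float *a, int n, int iters) {
    #pragma omp target data map(tofrom: a[0:n])
    for (int t = 0; t < iters; t++) {
        #pragma omp target
        for (int i = 0; i < n; i++)
            a[i] += 1.0f;  // 'a' is already resident on the device
    }
}
```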
Mukherjee, Manasij; Regehr, John
(Proceedings of the ACM on Programming Languages)
Optimizing compilers rely on peephole optimizations to simplify combinations of instructions and remove redundant instructions. Typically, a new peephole optimization is added when a compiler developer notices an optimization opportunity---a collection of dependent instructions that can be improved---and manually derives a more general rewrite rule that optimizes not only the original code, but also other, similar collections of instructions. In this paper, we present Hydra, a tool that automates the process of generalizing peephole optimizations using a collection of techniques centered on program synthesis. One of the most important problems we have solved is finding a version of each optimization that is independent of the bitwidths of the optimization's inputs (when this version exists). We show that Hydra can generalize 75% of the ungeneralized missed peephole optimizations that LLVM developers have posted to the LLVM project's issue tracker. All of Hydra's generalized peephole optimizations have been formally verified, and furthermore we can automatically turn them into C++ code that is suitable for inclusion in an LLVM pass.
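For intuition, here is the shape of the generalization problem, using an identity we chose ourselves rather than one from Hydra's evaluation: a rewrite noticed at one bitwidth, such as (x ^ 0xFF) + 1 ==> -x on 8-bit values, is only safe to add to the compiler once it is established at every width.

```cpp
// Our illustrative identity (not Hydra output): (x ^ allOnes) + 1 == -x
// holds at every bitwidth, which is the width-independence a generalized
// peephole rule must establish.
#include <cstdint>
#include <cassert>

template <typename UInt>
UInt neg_via_xor(UInt x) {
    UInt all_ones = static_cast<UInt>(~UInt{0});
    return static_cast<UInt>((x ^ all_ones) + UInt{1});
}

int main() {
    // Exhaustive at 8 bits, spot-checked at 64: the rule is not tied to
    // the width where the missed optimization was first reported.
    for (unsigned i = 0; i < 256; i++)
        assert(neg_via_xor<uint8_t>(uint8_t(i)) == uint8_t(-int(i)));
    assert(neg_via_xor<uint64_t>(12345) == uint64_t(0) - 12345);
    return 0;
}
```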
Huang, Shan; Liang, Jingjing; Su, Ting; Zhang, Qirun
(Proceedings of the ACM on Programming Languages)
Optimizing compilers, such as LLVM, generate debug information in machine code to aid debugging. This information is particularly important when debugging optimized code, as modern software is often compiled with optimization enabled. However, properly updating debug information to reflect code transformations during optimization is a complex task that often relies on manual effort. This complexity makes the process prone to errors, which can lead to incorrect or lost debug information. Finding and fixing potential debug information update errors is vital to maintaining the accuracy and reliability of the overall debugging process. To our knowledge, no existing techniques can rectify debug information update errors in LLVM. While black-box testing approaches can find such bugs, they can neither pinpoint the root causes nor suggest fixes. To fill the gap, we propose the first technique to robustify debug information updates in LLVM. In particular, our robustification approach can find and fix incorrect debug location updates. Central to our approach is the observation that the debug locations in the original and optimized programs must satisfy a conformance relation. The relation ensures that LLVM optimizations do not introduce extraneous debug location information on the control-flow paths of the optimized programs. We introduce control-flow conformance analysis, a novel approach that determines the reference updates ensuring the conformance relation by observing the execution of LLVM optimization passes and analyzing the debug locations in the control-flow graphs of programs under optimization. The determined reference updates are then used to check developer-written updates in LLVM. When discrepancies arise, the reference updates serve as the update skeletons to guide the fixing. We realized our approach as a tool named MetaLoc, which determines proper debug location updates for LLVM optimizations. More importantly, with MetaLoc, we have reported and patched 46 previously unknown update errors in LLVM. All the patches, along with 22 new regression tests, have been merged into the LLVM codebase, effectively improving the accuracy and reliability of debug information in all programs optimized by LLVM. Furthermore, our approach uncovered and led to corrections in two issues within LLVM's official documentation on debug information updates.
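As one concrete instance of the updates being checked, consider how LLVM's own C++ API expects a pass to handle debug locations when it folds two instructions into one; this sketch is ours, written against the public API, and is not MetaLoc's code. Keeping one original location unconditionally can place that location on control-flow paths it never belonged to, which is exactly the extraneous-location situation the conformance relation rules out.

```cpp
// Our sketch against LLVM's C++ API (not MetaLoc itself): the correct
// debug-location update when two instructions are folded into one.
#include "llvm/IR/Instruction.h"
#include "llvm/IR/DebugInfoMetadata.h"

using namespace llvm;

void foldWithDebugLoc(Instruction *Repl, Instruction *A, Instruction *B) {
  // getMergedLocation yields a location valid for both originals (or a
  // line-0 location when they disagree). Copying A's location verbatim
  // here would be the kind of developer-written update MetaLoc flags.
  Repl->setDebugLoc(DILocation::getMergedLocation(A->getDebugLoc().get(),
                                                  B->getDebugLoc().get()));
}
```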
Moses, William; Churavy, Valentin
(Advances in Neural Information Processing Systems)
Applying differentiable programming techniques and machine learning algorithms to foreign programs requires developers to either rewrite their code in a machine learning framework, or otherwise provide derivatives of the foreign code. This paper presents Enzyme, a high-performance automatic differentiation (AD) compiler plugin for the LLVM compiler framework capable of synthesizing gradients of statically analyzable programs expressed in the LLVM intermediate representation (IR). Enzyme synthesizes gradients for programs written in any language whose compiler targets LLVM IR, including C, C++, Fortran, Julia, Rust, Swift, MLIR, etc., thereby providing native AD capabilities in these languages. Unlike traditional source-to-source and operator-overloading tools, Enzyme performs AD on optimized IR. On a machine-learning-focused benchmark suite including Microsoft's ADBench, AD on optimized IR achieves a geometric-mean speedup of 4.2× over AD on IR before optimization, allowing Enzyme to achieve state-of-the-art performance. Packaging Enzyme for PyTorch and TensorFlow provides convenient access to gradients of foreign code with state-of-the-art performance, enabling foreign code to be directly incorporated into existing machine learning workflows.
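Enzyme's C/C++ entry point makes the interface easy to see. The following mirrors the usage pattern from Enzyme's documentation; it must be compiled with Clang and the Enzyme plugin loaded so the call below is replaced by a synthesized gradient.

```cpp
// Usage pattern from Enzyme's documentation: ask the plugin to
// synthesize the gradient of an ordinary function at compile time.
#include <cstdio>

extern double __enzyme_autodiff(void *, ...);

double square(double x) { return x * x; }

int main() {
    // Enzyme differentiates the (optimized) IR of 'square'.
    double grad = __enzyme_autodiff((void *)square, 3.0);
    std::printf("d/dx x*x at x=3: %f\n", grad);  // prints 6.000000
    return 0;
}
```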
Yang, Chenyuan; Deng, Yinlin; Lu, Runyu; Yao, Jiayi; Liu, Jiawei; Jabbarvand, Reyhaneh; Zhang, Lingming
(Proceedings of the ACM on Programming Languages)
Compiler correctness is crucial, as miscompilation can falsify program behaviors, leading to serious consequences across the software supply chain. In the literature, fuzzing has been extensively studied to uncover compiler defects. However, compiler fuzzing remains challenging: existing approaches focus on black- and grey-box fuzzing, which generates test programs without sufficient understanding of internal compiler behaviors. As such, they often fail to construct test programs that exercise intricate optimizations. Meanwhile, traditional white-box techniques, such as symbolic execution, are computationally inapplicable to the giant codebases of compiler systems. Recent advances demonstrate that Large Language Models (LLMs) excel in code generation/understanding tasks and have even achieved state-of-the-art performance in black-box fuzzing. Nonetheless, guiding LLMs with compiler source-code information remains a missing piece of research in compiler testing. To this end, we propose WhiteFox, the first white-box compiler fuzzer using LLMs with source-code information to test compiler optimization, with a spotlight on detecting deep logic bugs in the emerging deep learning (DL) compilers. WhiteFox adopts a multi-agent framework: (i) an LLM-based analysis agent examines the low-level optimization source code and produces requirements on the high-level test programs that can trigger the optimization; (ii) an LLM-based generation agent produces test programs based on the summarized requirements. Additionally, optimization-triggering tests are used as feedback to further enhance the test-generation prompt on the fly. Our evaluation on the three most popular DL compilers (i.e., PyTorch Inductor, TensorFlow-XLA, and TensorFlow Lite) shows that WhiteFox can generate high-quality test programs that exercise deep optimizations requiring intricate conditions, exercising up to 8 times more optimizations than state-of-the-art fuzzers. To date, WhiteFox has found a total of 101 bugs in the compilers under test, with 92 confirmed as previously unknown and 70 already fixed. Notably, WhiteFox has recently been acknowledged by the PyTorch team and is in the process of being incorporated into its development workflow. Finally, beyond DL compilers, WhiteFox can also be adapted for compilers in different domains, such as LLVM, where WhiteFox has already found multiple bugs.
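The two-agent loop can be summarized schematically; every name below is our invention for illustration, a stand-in for the paper's LLM-backed components rather than its actual code.

```cpp
// Schematic of the WhiteFox feedback loop with hypothetical stand-ins
// (no real LLM calls): analyze an optimization's source into trigger
// requirements, generate tests from them, and feed triggering tests
// back into later generation rounds as few-shot examples.
#include <iostream>
#include <string>
#include <vector>

std::string analyzeOptimization(const std::string &optSource) {
    return "trigger requirements for " + optSource;  // agent (i) stand-in
}

std::string generateTest(const std::string &req,
                         const std::vector<std::string> &fewShot) {
    return "test targeting [" + req + "] with " +
           std::to_string(fewShot.size()) + " examples";  // agent (ii)
}

bool triggersOptimization(const std::string &test) {
    return !test.empty();  // stand-in for running the compiler under test
}

int main() {
    std::vector<std::string> fewShot;
    std::string req = analyzeOptimization("fuse_conv_relu.cc");
    for (int round = 0; round < 3; round++) {
        std::string test = generateTest(req, fewShot);
        if (triggersOptimization(test))
            fewShot.push_back(test);  // feedback sharpens later prompts
        std::cout << test << "\n";
    }
    return 0;
}
```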
Liu, Zhengyang, Stefan Mada, and John Regehr. "Minotaur: A SIMD-Oriented Synthesizing Superoptimizer." Proceedings of the ACM on Programming Languages 8 (OOPSLA2). ACM. https://doi.org/10.1145/3689766. https://par.nsf.gov/biblio/10646970.
@article{osti_10646970,
title = {Minotaur: A SIMD-Oriented Synthesizing Superoptimizer},
url = {https://par.nsf.gov/biblio/10646970},
DOI = {10.1145/3689766},
abstractNote = {A superoptimizing compiler (one that performs a meaningful search of the program space as part of the optimization process) can find optimization opportunities that are missed by even the best existing optimizing compilers. We created Minotaur: a superoptimizer for LLVM that uses program synthesis to improve its code generation, focusing on integer and floating-point SIMD code. On an Intel Cascade Lake processor, Minotaur achieves an average speedup of 7.3% on the GNU Multiple Precision (GMP) library's benchmark suite, with a maximum speedup of 13%. On SPEC CPU 2017, our superoptimizer produces an average speedup of 1.5%, with a maximum speedup of 4.5% for 638.imagick. Every optimization produced by Minotaur has been formally verified, and several optimizations that it has discovered have been implemented in LLVM as a result of our work.},
journal = {Proceedings of the ACM on Programming Languages},
volume = {8},
number = {OOPSLA2},
publisher = {ACM},
author = {Liu, Zhengyang and Mada, Stefan and Regehr, John},
}