Memory profiling captures programs’ dynamic memory behavior, assisting programmers in debugging, tuning, and enabling advanced compiler optimizations like speculation-based automatic parallelization. As each use case demands its unique program trace summary, various memory profiler types have been developed. Yet, designing practical memory profilers often requires extensive compiler expertise, adeptness in program optimization, and significant implementation effort. This often results in a void where aspirations for fast and robust profilers remain unfulfilled. To bridge this gap, this paper presents PROMPT, a framework for streamlined development of fast memory profilers. With PROMPT, developers need only specify profiling events and define the core profiling logic, bypassing the complexities of custom instrumentation and intricate memory profiling components and optimizations. Two state-of-the-art memory profilers were ported with PROMPT where all features preserved. By focusing on the core profiling logic, the code was reduced by more than 65% and the profiling overhead was improved by 5.3× and 7.1× respectively. To further underscore PROMPT’s impact, a tailored memory profiling workflow was constructed for a sophisticated compiler optimization client. In 570 lines of code, this redesigned workflow satisfies the client’s memory profiling needs while achieving more than 90% reduction in profiling overhead and improved robustness compared to the original profilers.
more »
« less
This content will become publicly available on June 10, 2026
Reductive Analysis with Compiler-Guided Large Language Models for Input-Centric Code Optimizations
Input-centric program optimization aims to optimize code by considering the relations between program inputs and program behaviors. Despite its promise, a long-standing barrier for its adoption is the difficulty of automatically identifying critical features of complex inputs. This paper introduces a novel technique,reductive analysis through compiler-guided Large Language Models (LLMs), to solve the problem through a synergy between compilers and LLMs. It uses a reductive approach to overcome the scalability and other limitations of LLMs in program code analysis. The solution, for the first time, automates the identification of critical input features without heavy instrumentation or profiling, cutting the time needed for input identification by 44× (or 450× for local LLMs), reduced from 9.6 hours to 13 minutes (with remote LLMs) or 77 seconds (with local LLMs) on average, making input characterization possible to be integrated into the workflow of program compilations. Optimizations on those identified input features show similar or even better results than those identified by previous profiling-based methods, leading to optimizations that yield 92.6% accuracy in selecting the appropriate adaptive OpenMP parallelization decisions, and 20–30% performance improvement of serverless computing while reducing resource usage by 50–60%.
more »
« less
- PAR ID:
- 10650639
- Publisher / Repository:
- ACM Digital Library
- Date Published:
- Journal Name:
- Proceedings of the ACM on Programming Languages
- Volume:
- 9
- Issue:
- PLDI
- ISSN:
- 2475-1421
- Page Range / eLocation ID:
- 797 to 821
- Subject(s) / Keyword(s):
- Large Language Models, Program Optimization, Input-Centric Optimization, Seminal Behavior Identification, Predictive Modeling
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Greybox fuzzing and mutation testing are two popular but mostly independent fields of software testing research that have so far had limited overlap. Greybox fuzzing, generally geared towards searching for new bugs, predominantly uses code coverage for selecting inputs to save. Mutation testing is primarily used as a stronger alternative to code coverage in assessing the quality of regression tests; the idea is to evaluate tests for their ability to identify artificially injected faults in the target program. But what if we wanted to use greybox fuzzing to synthesize high-quality regression tests? In this paper, we develop and evaluate Mu2, a Java-based framework for incorporating mutation analysis in the greybox fuzzing loop, with the goal of producing a test-input corpus with a high mutation score. Mu2 makes use of a differential oracle for identifying inputs that exercise interesting program behavior without causing crashes. This paper describes several dynamic optimizations implemented in Mu2 to overcome the high cost of performing mutation analysis with every fuzzer-generated input. These optimizations introduce trade-offs in fuzzing throughput and mutation killing ability, which we evaluate empirically on five real-world Java benchmarks. Overall, variants of Mu2 are able to synthesize test-input corpora with a higher mutation score than state-of-the-art Java fuzzer Zest.more » « less
-
When software engineering researchers discuss "similar" code, we often mean code determined by static analysis to be textually, syntactically or structurally similar, known as code clones (looks alike). Ideally, we would like to also include code that is behaviorally or functionally similar, even if it looks completely different. The state of the art in detecting these behavioral clones focuses on checking the functional equivalence of the inputs and outputs of code fragments, regardless of its internal behavior (focusing only on input and output states). We argue that with an advance in dynamic code clone detection towards detecting behavioral clones (i.e., those with similar execution behavior), we can greatly increase the applications of behavioral clones as a whole for general program understanding tasks.more » « less
-
When software engineering researchers discuss "similar" code, we often mean code determined by static analysis to be textually, syntactically or structurally similar, known as code clones (looks alike). Ideally, we would like to also include code that is behaviorally or functionally similar, even if it looks completely different. The state of the art in detecting these behavioral clones focuses on checking the functional equivalence of the inputs and outputs of code fragments, regardless of its internal behavior (focusing only on input and output states). We argue that with an advance in dynamic code clone detection towards detecting behavioral clones (i.e., those with similar execution behavior), we can greatly increase the applications of behavioral clones as a whole for general program understanding tasks.more » « less
-
Deep Neural Networks (DNN) are increasingly used in a variety of applications, many of them with serious safety and security concerns. This paper describes DeepCheck, a new approach for validating DNNs based on core ideas from program analysis, specifically from symbolic execution. DeepCheck implements novel techniques for lightweight symbolic analysis of DNNs and applies them to address two challenging problems in DNN analysis: 1) identification of important input features and 2) leveraging those features to create adversarial inputs. Experimental results with an MNIST image classification network and a sentiment network for textual data show that DeepCheck promises to be a valuable tool for DNN analysis.more » « less
An official website of the United States government
