Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.
- Machine learning (ML) applications have become an integral part of our lives. ML applications extensively use floating-point computation and involve very large/small numbers; thus, maintaining the numerical stability of such complex computations remains an important challenge. Numerical bugs can lead to system crashes, incorrect output, and wasted computing resources. In this paper, we introduce a novel idea, namely soft assertions (SA), to encode safety/error conditions for the places where numerical instability can occur. A soft assertion is an ML model automatically trained using the dataset obtained during unit testing of unstable functions. Given the values at the unstable function in an ML application, a soft assertion reports how to change these values in order to trigger the instability. We then use the output of soft assertions as signals to effectively mutate inputs to trigger numerical instability in ML applications. In the evaluation, we used the GRIST benchmark, a total of 79 programs, as well as 15 real-world ML applications from GitHub. We compared our tool with 5 state-of-the-art (SOTA) fuzzers. We found all the GRIST bugs and outperformed the baselines. We found 13 numerical bugs in real-world code, one of which had already been confirmed by the GitHub developers. While the baselines mostly found bugs that report NaN and INF, our tool also found numerical bugs that produce incorrect output. We showed one case where the Tumor Detection Model, trained on Brain MRI images, should have predicted "tumor" but, due to the numerical bugs, incorrectly predicted "no tumor". Our replication package is located at https://figshare.com/s/6528d21ccd28bea94c32. Free, publicly-accessible full text available June 19, 2026.
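  The abstract above only outlines the soft-assertion workflow, so the following is a minimal, hypothetical sketch of the idea: unit-test an unstable function, learn from the inputs that failed, and use that learned signal to steer input mutation. The `unstable_log_softmax` function, the nearest-neighbor stand-in for the trained model, and all constants are illustrative assumptions, not the paper's implementation.

  ```python
  # Illustrative sketch only (not the paper's implementation): a toy "soft
  # assertion" for one unstable function, learned from unit-test data and then
  # used as a signal to mutate inputs toward numerical instability.
  import numpy as np

  def unstable_log_softmax(x):
      # Naive log-softmax: np.exp overflows for large inputs, yielding inf/nan.
      e = np.exp(x)
      return np.log(e / e.sum())

  # 1. Unit-test the unstable function on random inputs; label each input by
  #    whether the output contained inf/nan.
  rng = np.random.default_rng(0)
  train_x = rng.uniform(-200.0, 800.0, size=(2000, 4))
  with np.errstate(over="ignore", invalid="ignore", divide="ignore"):
      train_y = np.array(
          [not np.all(np.isfinite(unstable_log_softmax(x))) for x in train_x]
      )
  unstable_inputs = train_x[train_y]

  # 2. A minimal stand-in for the trained soft assertion: given the concrete
  #    values reaching the function, suggest how to change them to trigger
  #    instability (here, move toward the closest failing unit-test input).
  def soft_assertion(x):
      diffs = unstable_inputs - x
      return diffs[np.argmin(np.linalg.norm(diffs, axis=1))]

  # 3. Fuzzing loop: follow the soft assertion's signal with fixed-size steps
  #    until the instability is actually observed.
  x = np.array([1.0, 2.0, 3.0, 4.0])
  for step in range(200):
      with np.errstate(over="ignore", invalid="ignore", divide="ignore"):
          out = unstable_log_softmax(x)
      if not np.all(np.isfinite(out)):
          print(f"instability triggered after {step} mutations, input = {x}")
          break
      direction = soft_assertion(x)
      x = x + 25.0 * direction / np.linalg.norm(direction)
  ```

  In the paper, the soft assertion is a trained ML model rather than the nearest-neighbor lookup used here; the sketch only illustrates that unit-test data can yield a signal telling the fuzzer in which direction to mutate concrete values.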
- Most existing pre-trained language models for source code focus on learning from static code text, typically augmented with static code structures (abstract syntax trees, dependency graphs, etc.). However, program semantics are not fully exposed until real execution. Without an understanding of program execution, statically pre-trained models fail to comprehensively capture dynamic code properties, such as branch coverage and runtime variable values, and they are consequently less effective at code understanding tasks such as retrieving semantic clones and detecting software vulnerabilities. To close the gap between the static nature of language models and the dynamic characteristics of programs, we introduce TRACED, an execution-aware pre-training strategy for source code. Specifically, we pre-train code language models with a combination of source code, executable inputs, and the corresponding execution traces. Our goal is to teach code models the complicated execution logic during pre-training, enabling the model to statically estimate dynamic code properties without repeatedly executing code during task-specific fine-tuning. To illustrate the effectiveness of our proposed approach, we fine-tune and evaluate TRACED on three downstream tasks: static execution estimation, clone retrieval, and vulnerability detection. The empirical results show that TRACED improves statically pre-trained code models by a relative 12.4% for complete execution path prediction and by a relative 25.2% for runtime variable value prediction. TRACED also significantly outperforms statically pre-trained models in clone retrieval and vulnerability detection across four public benchmarks.
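  Since the abstract describes pre-training on triples of source code, executable inputs, and execution traces, here is a small, hypothetical sketch of how one such execution-aware training record could be assembled in Python. The tracing helper, the record layout, and the separator tokens (`<input>`, `<trace>`, `<return>`) are assumptions made for illustration; TRACED's actual data pipeline and serialization are not specified here.

  ```python
  # Illustrative sketch only (assumptions, not TRACED's actual pipeline): pair a
  # program's source with an executable input and the execution trace it
  # induces, producing one execution-aware pre-training record.
  import sys
  import inspect

  def program(a, b):
      if a > b:
          big, small = a, b
      else:
          big, small = b, a
      return big - small

  def collect_trace(func, *args):
      """Record executed lines and local variable values via sys.settrace."""
      events = []
      code = func.__code__

      def tracer(frame, event, arg):
          if frame.f_code is code and event == "line":
              # Line number relative to the function start, plus a snapshot of
              # the runtime variable values at that point.
              events.append((frame.f_lineno - code.co_firstlineno,
                             dict(frame.f_locals)))
          return tracer

      sys.settrace(tracer)
      try:
          result = func(*args)
      finally:
          sys.settrace(None)
      return events, result

  # Build one pre-training record: static source text + concrete input +
  # dynamic facts (covered lines and runtime variable values) as plain text.
  source = inspect.getsource(program)
  trace, result = collect_trace(program, 3, 7)
  record = (
      source
      + "<input> a=3, b=7\n"
      + "<trace> "
      + " ".join(f"line {ln}: {locs}" for ln, locs in trace)
      + f"\n<return> {result}\n"
  )
  print(record)
  ```

  A record like this pairs the static code text with the dynamic facts (which lines executed and what values the variables held), which is the kind of supervision the abstract says is used to teach the model to estimate dynamic properties statically.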