While CUDA has become a mainstream parallel computing platform and programming model for general-purpose GPU computing, how to effectively and efficiently detect CUDA synchronization bugs remains a challenging open problem. In this paper, we propose the first lightweight CUDA synchronization bug detection framework, namely Simulee, to model CUDA program execution by interpreting the corresponding LLVM bytecode and collecting the memory-access information for automatically detecting general CUDA synchronization bugs. To evaluate the effectiveness and efficiency of Simulee, we construct a benchmark with 7 popular CUDA-related projects from GitHub, upon which we conduct an extensive set of experiments. The experimental results suggest that Simulee can detect 21 out of the 24 manually identified bugs in our preliminary study and also 24 previously unknown bugs among all projects, 10 of which have already been confirmed by the developers. Furthermore, Simulee significantly outperforms state-of-the-art approaches for CUDA synchronization bug detection.
more »
« less
Generating precise error specifications for C: a zero shot learning approach
In C programs, error specifications, which specify the value range that each function returns to indicate failures, are widely used to check and propagate errors for the sake of reliability and security. Various kinds of C analyzers employ error specifications for different purposes, e.g., to detect error handling bugs, yet a general approach for generating precise specifications is still missing. This limits the applicability of those tools. In this paper, we solve this problem by developing a machine learning-based approach named MLPEx. It generates error specifications by analyzing only the source code, and is thus general. We propose a novel machine learning paradigm based on transfer learning, enabling MLPEx to require only one-time minimal data labeling from us (as the tool developers) and zero manual labeling efforts from users. To improve the accuracy of generated error specifications, MLPEx extracts and exploits project-specific information. We evaluate MLPEx on 10 projects, including 6 libraries and 4 applications. An investigation of 3,443 functions and 17,750 paths reveals that MLPEx generates error specifications with a precision of 91% and a recall of 94%, significantly higher than those of state-of-the-art approaches. To further demonstrate the usefulness of the generated error specifications, we use them to detect 57 bugs in 5 tested projects.
more »
« less
- Award ID(s):
- 1750886
- PAR ID:
- 10601780
- Publisher / Repository:
- Association for Computing Machinery (ACM)
- Date Published:
- Journal Name:
- Proceedings of the ACM on Programming Languages
- Volume:
- 3
- Issue:
- OOPSLA
- ISSN:
- 2475-1421
- Format(s):
- Medium: X Size: p. 1-30
- Size(s):
- p. 1-30
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Bonial, Claire; Bonn, Julia; Hwang, Jena D (Ed.)For many years, there has been attempts to compare predicate-argument labeling schemas between formalism, typically under the dependency assumptions (even if the annotation by these schemas could have been performed on either constituent-based specifications or dependency ones). Given the growing number of resources that link various lexical resources to one another, as well as thanks to parallel annotated corpora (with or without annotation), it is now possible to do more in-depth studies of those correspondences. We present here a high-coverage pilot study of mapping the labeling system used in PropBank (for English) to Czech, which has so far used mainly valency lexicons (in several closely related forms) for annotation projects, under a different level of specification and different theoretical assumptions. The purpose of this study is both theoretical (comparing the argument labeling schemes) and practical (to be able to annotate Czech under the standard UMR specifications).more » « less
-
null (Ed.)Bug localization plays an important role in software quality control. Many supervised machine learning models have been developed based on historical bug-fix information. Despite being successful, these methods often require sufficient historical data (i.e., labels), which is not always available especially for newly developed software projects. In response, cross-project bug localization techniques have recently emerged whose key idea is to transferring knowledge from label-rich source project to locate bugs in the target project. However, a major limitation of these existing techniques lies in that they fail to capture the specificity of each individual project, and are thus prone to negative transfer.To address this issue, we propose an adversarial transfer learning bug localization approach, focusing on only transferring the common characteristics (i.e., public information) across projects. Specifically, our approach (CooBa) learns the indicative public information from cross-project bug reports through a shared encoder, and extracts the private information from code files by an individual feature extractor for each project. CooBa further incorporates adversarial learning mechanism to ensure that public information shared between multiple projects could be effectively extracted. Extensive experiments on four large-scale real-world data sets demonstrate that the proposed CooBa significantly outperforms the state of the art techniques.more » « less
-
Abstract This paper presents a new approach to improve static program analysis using Large Language Models (LLMs). The approachinterleavescalls to the static analyzer and queries to the LLM. The query to the LLM is constructed based on intermediate results from the static analysis, and subsequent static analysis uses the results from the LLM query. We apply our approach to the problem oferror-specification inference: given systems code written in C, infer the set of values that each function can return on error. Such error specifications aid in program understanding and can be used to find error-handling bugs. We implemented our approach by incorporating LLMs into EESI, the state-of-the-art static analysis for error-specification inference. Compared to EESI, our approach achieves higher recall (from an average of 52.55% to 77.83%) and higher F1-score (from an average of 0.612 to 0.804) while maintaining precision (from an average of 86.67% to 85.12%) on real-world benchmarks such as Apache HTTPD and MbedTLS. We also conducted experiments to understand the sources of imprecision in our LLM-assisted analysis as well as the impact of LLM nondeterminism on the analysis results.more » « less
-
Abstract: Surface acoustic wave (SAW) sensors with increasingly unique and refined designed patterns are often developed using the lithographic fabrication processes. Emerging applications of SAW sensors often require novel materials, which may present uncharted fabrication outcomes. The fidelity of the SAW sensor performance is often correlated with the ability to restrict the presence of defects in post-fabrication. Therefore, it is critical to have effective means to detect the presence of defects within the SAW sensor. However, labor-intensive manual labeling is often required due to the need for precision identification and classification of surface features for increased confidence in model accuracy. One approach to automating defect detection is to leverage effective machine learning techniques to analyze and quantify defects within the SAW sensor. In this paper, we propose a machine learning approach using a deep convolutional autoencoder to segment surface features semantically. The proposed deep image autoencoder takes a grayscale input image and generates a color image segmenting the defect region in red, metallic interdigital transducing (IDT) fingers in green, and the substrate region in blue. Experimental results demonstrate promising segmentation scores in locating the defects and regions of interest for a novel SAW sensor variant. The proposed method can automate the process of localizing and measuring post-fabrication defects at the pixel level that may be missed by error-prone visual inspection.more » « less
An official website of the United States government
