While Terraform has gained popularity for implementing the practice of infrastructure as code (IaC), static analysis for Terraform manifests remains poorly characterized. This lack of characterization hinders practitioners from assessing how to use static analysis in their Terraform development process, as happened at Company A, an organization that uses Terraform to create automated software deployment pipelines. In this experience report, we investigated 491 static analysis alerts that occur in 10 open-source and one proprietary Terraform repository. From our analysis we observe that: (i) 10 categories of static analysis alerts appear for Terraform manifests, of which five are related to security; (ii) Terraform resources with dependencies have more static analysis alerts than resources with no dependencies; and (iii) practitioner perceptions vary from one alert category to another when deciding whether to act on reported alerts. We conclude our paper by providing a list of lessons for practitioners and toolsmiths on how to improve static analysis for Terraform manifests.
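To make the kind of analysis described above concrete, the following is a minimal Python sketch of how one might tally alerts per category and compare alert counts for resources with and without dependencies (in the spirit of findings i and ii). The alert records, field names, and dependency map are hypothetical placeholders, not the paper's actual tooling or data.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical alert records, e.g. collected from a Terraform static analysis
# tool's output (field names are illustrative, not a real tool's schema).
alerts = [
    {"resource": "aws_s3_bucket.logs", "category": "unencrypted storage"},
    {"resource": "aws_s3_bucket.logs", "category": "missing access logging"},
    {"resource": "aws_instance.web", "category": "hard-coded secret"},
    {"resource": "aws_vpc.main", "category": "deprecated syntax"},
]

# Hypothetical dependency information, e.g. derived from `terraform graph`:
# resource address -> addresses of resources it depends on.
depends_on = {
    "aws_s3_bucket.logs": ["aws_kms_key.logs"],
    "aws_instance.web": ["aws_vpc.main"],
    "aws_vpc.main": [],
}

# (i) How often does each alert category occur?
category_counts = Counter(a["category"] for a in alerts)

# (ii) Do resources with dependencies accumulate more alerts?
alerts_per_resource = defaultdict(int)
for a in alerts:
    alerts_per_resource[a["resource"]] += 1

with_deps = [alerts_per_resource[r] for r, d in depends_on.items() if d]
without_deps = [alerts_per_resource[r] for r, d in depends_on.items() if not d]

print("Alert categories:", dict(category_counts))
print("Mean alerts, resources with dependencies:", mean(with_deps))
print("Mean alerts, resources without dependencies:", mean(without_deps))
```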
Finding 709 Defects in 258 Projects: An Experience Report on Applying CodeQL to Open-Source Embedded Software (Experience Paper)
This artifact contains the GitHub workflows to run CodeQL on EMBOSS repositories in our dataset, the results of running CodeQL on these repositories, and our manual analysis of CodeQL results.
- Award ID(s): 2247686
- PAR ID: 10600544
- Publisher / Repository: Zenodo
- Date Published:
- Format(s): Medium: X
- Right(s): Creative Commons Attribution 4.0 International
- Sponsoring Org: National Science Foundation
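The artifact described above packages GitHub workflows that run CodeQL on the dataset's repositories. For orientation, here is a minimal Python sketch of the two CodeQL CLI steps such a workflow typically automates (database creation and analysis). The repository path, build command, and query suite in the sketch are illustrative assumptions, not values taken from the artifact.

```python
import subprocess

# Minimal sketch: build a CodeQL database for a C project, then analyze it
# into SARIF. Path, build command, and query suite are assumptions.
repo_dir = "some-embedded-project"      # hypothetical checkout of one dataset repo
db_dir = "codeql-db"

subprocess.run(
    [
        "codeql", "database", "create", db_dir,
        "--language=cpp",               # CodeQL's cpp extractor also covers C code
        f"--source-root={repo_dir}",
        "--command=make",               # hypothetical build command for the project
    ],
    check=True,
)

subprocess.run(
    [
        "codeql", "database", "analyze", db_dir,
        "cpp-security-and-quality.qls", # an assumed query suite choice
        "--format=sarif-latest",
        "--output=results.sarif",
    ],
    check=True,
)
```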
More Like this
- Large language models (LLMs) are widely used in software development. However, the code generated by LLMs often contains vulnerabilities. Several secure code generation methods have been proposed to address this issue, but their current evaluation schemes leave several concerns unaddressed. Specifically, most existing studies evaluate security and functional correctness separately, using different datasets. That is, they assess vulnerabilities using security-related code datasets while validating functionality with general code datasets. In addition, prior research primarily relies on a single static analyzer, CodeQL, to detect vulnerabilities in generated code, which limits the scope of security evaluation. In this work, we conduct a comprehensive study to systematically assess the improvements introduced by four state-of-the-art secure code generation techniques. Specifically, we apply both security inspection and functionality validation to the same generated code and evaluate these two aspects together. We also employ three popular static analyzers and two LLMs to identify potential vulnerabilities in the generated code. Our study reveals that existing techniques often compromise the functionality of generated code to enhance security. Their overall performance remains limited when security and functionality are evaluated together. In fact, many techniques even degrade the performance of the base LLM by more than 50%. Our further inspection reveals that these techniques often either remove vulnerable lines of code entirely or generate “garbage code” that is unrelated to the intended task. Moreover, the commonly used static analyzer CodeQL fails to detect several vulnerabilities, further obscuring the actual security improvements achieved by existing techniques. Our study serves as a guideline for a more rigorous and comprehensive evaluation of secure code generation performance in future work.
- Existing malicious code detection techniques demand the integration of multiple tools to detect different malware patterns and often suffer from high misclassification rates. Malicious code detection could therefore be enhanced by adopting more automated approaches that achieve high accuracy and a low misclassification rate. The goal of this study is to aid security analysts in detecting malicious packages by empirically studying the effectiveness of Large Language Models (LLMs) in detecting malicious code. We present SocketAI, a malicious code review workflow for detecting malicious code. To evaluate the effectiveness of SocketAI, we leverage a benchmark dataset of 5,115 npm packages, of which 2,180 contain malicious code. We conducted a baseline comparison of GPT-3 and GPT-4 models with the state-of-the-art CodeQL static analysis tool, using 39 custom CodeQL rules developed in prior research to detect malicious JavaScript code. We also compare the effectiveness of static analysis as a pre-screener within the SocketAI workflow, measuring the number of files that need to be analyzed and the associated costs (a minimal sketch of such a pre-screening pipeline appears after this list). Additionally, we performed a qualitative study to understand the types of malicious packages detected or missed by our workflow. Our baseline comparison demonstrates a 16% and 9% improvement over static analysis in precision and F1 scores, respectively. GPT-4 achieves higher accuracy with 99% precision and 97% F1 scores, while GPT-3 offers a more cost-effective balance at 91% precision and 94% F1 scores. Pre-screening files with a static analyzer reduces the number of files requiring LLM analysis by 77.9% and decreases costs by 60.9% for GPT-3 and 76.1% for GPT-4. Our qualitative analysis identified data theft, execution of arbitrary code, and suspicious domains as the top categories of detected malicious packages.
- With the rise in threats against the software supply chain, developer integrated development environments (IDEs) present an attractive target for attackers. For example, researchers have found extensions for Visual Studio Code (VS Code) that start web servers and can be exploited via JavaScript executing in a web browser on the developer's host. This paper seeks to systematically understand the landscape of vulnerabilities in VS Code's extension marketplace. We identify a set of four sources of untrusted input and three code targets that can be used for code injection and file integrity attacks, and use them to design taint analysis rules in CodeQL. We then perform an ecosystem-level analysis of the VS Code extension marketplace, studying 25,402 extensions that contain code. Our results show that while vulnerabilities are not pervasive, they exist and impact millions of users. Specifically, we find 21 extensions with verified proof-of-concept exploits of code injection attacks, impacting a total of over 6 million installations. Through this study, we demonstrate the need for greater attention to the security of IDE extensions.
- Patil, Vishwas T.; Krishnan, Ram; Shyamasundar, Rudrapatna K. (Ed.)
  Open-source software (OSS) is important and useful. We want to ensure that it is of high quality and free of security issues. Static analysis tools provide easy-to-use and application-independent mechanisms to assess various aspects of a given code base. Many effective open-source static analysis tools exist. In this paper, we perform the first comprehensive analysis using 24 open-source static analysis tools (through Omega Analyzer) on 4,947 repositories. Our study identified several interesting findings; for example, the distribution of errors in relation to the criticality score of repositories shows that repositories with a criticality score have the highest percentage of errors. We envision that our findings provide insights into the effectiveness of static analysis tools on OSS and point to future research directions in securing OSS repositories.
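As referenced in the SocketAI summary above, here is a minimal sketch of a static-analysis pre-screening step in front of an LLM review: only files flagged by the pre-screener are forwarded to the more expensive model, which is where the reported cost savings come from. The `run_static_analyzer` and `llm_review` functions are hypothetical placeholders; the actual SocketAI workflow, prompts, and CodeQL rules are described in that paper.

```python
from typing import Iterable

def run_static_analyzer(path: str) -> bool:
    """Hypothetical pre-screener: flag a file if it matches any suspicious
    pattern (a stand-in for running custom static analysis rules)."""
    suspicious_markers = ("child_process", "eval(", "http://")  # illustrative only
    with open(path, encoding="utf-8", errors="ignore") as f:
        text = f.read()
    return any(marker in text for marker in suspicious_markers)

def llm_review(path: str) -> str:
    """Hypothetical stand-in for an LLM call that labels a flagged file;
    a real workflow would call a model API with the file's contents here."""
    return "needs-review"

def prescreen_and_review(files: Iterable[str]) -> tuple[dict, int]:
    """Send only statically flagged files to the LLM and count how many
    LLM calls the pre-screener avoided."""
    verdicts, skipped = {}, 0
    for path in files:
        if run_static_analyzer(path):
            verdicts[path] = llm_review(path)
        else:
            skipped += 1
    return verdicts, skipped
```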