Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
As the reliance on open-source software dependencies increases, managing the security vulnerabilities in these dependencies becomes complex. State-of-the-art industry tools use reachability analysis of code to alert developers when security vulnerabilities in dependencies are likely to impact their projects. These tools heavily rely on precisely identifying the location of the vulnerability within the dependency, specifically vulnerable functions. However, the process of identifying vulnerable functions is currently either manual or uses a naive automated approach that falsely assumes all changed functions in a security patch link are vulnerable. In this paper, we explore using open-source large language models (LLMs) to improve pairing security advisories with vulnerable functions. We explore various prompting strategies, learning paradigms (i.e., zero-shot vs. few-shot), and show our approach generalizes to other open-source LLMs. Compared to the naive automated approach, we show a 173% increase in precision while only having an 18% decrease in recall. The significant increase in precision to enhance vulnerable function identification lays the groundwork for downstream techniques that depend on this critical information for security analysis and threat mitigation.more » « less
-
Security advisories are the primary channel of communication for discovered vulnerabilities in open-source software, but they often lack crucial information. Specifically, 63% of vulnerability database reports are missing their patch links, also referred to as vulnerability fixing commits (VFCs). This paper introduces VFCFinder, a tool that generates the top-five ranked set of VFCs for a given security advisory using Natural Language Programming Language (NL-PL) models. VFCFinder yields a 96.6% recall for finding the correct VFC within the Top-5 commits, and an 80.0% recall for the Top-1 ranked commit. VFCFinder generalizes to nine different programming languages and outperforms state-of-the-art approaches by 36 percentage points in terms of Top-1 recall. As a practical contribution, we used VFCFinder to backfill over 300 missing VFCs in the GitHub Security Advisory (GHSA) database. All of the VFCs were accepted and merged into the GHSA database. In addition to demonstrating a practical pairing of security advisories to VFCs, our general open-source implementation will allow vulnerability database maintainers to drastically improve data quality, supporting efforts to secure the software supply chain.more » « less
-
With the rise in threats against the software supply chain, developer integrated development environments (IDEs) present an attractive target for attackers. For example, researchers have found extensions for Visual Studio Code (VS Code) that start web servers and can be exploited via JavaScript executing in a web browser on the developer's host. This paper seeks to systematically understand the landscape of vulnerabilities in VS Code's extension marketplace. We identify a set of four sources of untrusted input and three code targets that can be used for code injection and file integrity attacks and use them to design taint analysis rules in CodeQL. We then perform an ecosystem-level analysis of the VS Code extension marketplace, studying 25,402 extensions that contain code. Our results show that while vulnerabilities are not pervasive, they exist and impact millions of users. Specifically, we find 21 extensions with verified proof of concept exploits of code injection attacks impacting a total of over 6 million installations. Through this study, we demonstrate the need for greater attention to the security of IDE extensions.more » « less
-
Unsolicited bulk telephone calls — termed "robocalls" — nearly outnumber legitimate calls, overwhelming telephone users. While the vast majority of these calls are illegal, they are also ephemeral. Although telephone service providers, regulators, and researchers have ready access to call metadata, they do not have tools to investigate call content at the vast scale required. This paper presents SnorCall, a framework that scalably and efficiently extracts content from robocalls. SnorCall leverages the Snorkel framework that allows a domain expert to write simple labeling functions to classify text with high accuracy. We apply SnorCall to a corpus of transcripts covering 232,723 robocalls collected over a 23-month period. Among many other findings, SnorCall enables us to obtain first estimates on how prevalent different scam and legitimate robocall topics are, determine which organizations are referenced in these calls, estimate the average amounts solicited in scam calls, identify shared infrastructure between campaigns, and monitor the rise and fall of election-related political calls. As a result, we demonstrate how regulators, carriers, anti-robocall product vendors, and researchers can use SnorCall to obtain powerful and accurate analyses of robocall content and trends that can lead to better defenses.more » « less
-
Software depends on upstream projects that regularly fix vulnerabilities, but the documentation of those vulnerabilities is often unreliable or unavailable. Automating the collection of existing vulnerability fixes is essential for downstream projects to reliably update their dependencies due to the sheer number of dependencies in modern software. Prior efforts rely solely on incomplete databases or imprecise or inaccurate statistical analysis of upstream repositories. In this paper, we introduce Differential Alert Analysis (DAA) to discover vulnerability fixes in software projects. In contrast to statistical analysis, DAA leverages static analysis security testing (SAST) tools, which reason over code context and semantics. We provide a language-independent implementation of DAA and show that for Python and Java based projects, DAA has high precision for a ground-truth dataset of vulnerability fixes — even with noisy and low-precision SAST tools. We then use DAA in two large-scale empirical studies covering several prominent ecosystems, finding hundreds of resolved alerts, including many never publicly disclosed. DAA thus provides a powerful, accurate primitive for software projects, code analysis tools, vulnerability databases, and researchers to characterize and enhance the security of software supply chains.more » « less
-
Recent years have shown increased cyber attacks targeting less secure elements in the software supply chain and causing fatal damage to businesses and organizations. Past well-known examples of software supply chain attacks are the SolarWinds or log4j incidents that have affected thousands of customers and businesses. The US government and industry are equally interested in enhancing soft- ware supply chain security. On February 22, 2023, researchers from the NSF-supported Secure Software Supply Chain Center (S3C2) conducted a Secure Software Supply Chain Summit with a diverse set of 17 practitioners from 15 companies. The goal of the Summit is to enable sharing between industry practitioners having practical experiences and challenges with software supply chain security and helping to form new collaborations. We conducted six-panel discussions based upon open-ended questions regarding software bill of materials (SBOMs), malicious commits, choosing new dependencies, build and deploy, the Executive Order 14028, and vulnerable dependencies. The open discussions enabled mutual sharing and shed light on common challenges that industry practitioners with practical experience face when securing their software supply chain. In this paper, we provide a summary of the Summit.more » « less
-
Desktop operating systems, including macOS, Windows 10, and Linux, are adopting the application-based security model pervasive in mobile platforms. In Linux, this transition is part of the movement towards two distribution-independent application platforms: Flatpak and Snap. This paper provides the first analysis of sandbox policies defined for Flatpak and Snap applications, covering 283 applications contained in both platforms. First, we find that 90.1% of Snaps and 58.3% of Flatpak applications studied are contained by tamperproof sandboxes. Further, we find evidence that package maintainers actively attempt to define least-privilege application policies. However, defining policy is difficult and error-prone. When studying the set of matching applications that appear in both Flatpak and Snap app stores, we frequently found policy mismatches: e.g., the Flatpak version has a broad privilege (e.g., file access) that the Snap version does not, or vice versa. This work provides confidence that Flatpak and Snap improve Linux platform security while highlighting opportunities for improvement.more » « less
An official website of the United States government

Full Text Available