skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Award ID contains: 1946273

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. As the reliance on open-source software dependencies increases, managing the security vulnerabilities in these dependencies becomes complex. State-of-the-art industry tools use reachability analysis of code to alert developers when security vulnerabilities in dependencies are likely to impact their projects. These tools heavily rely on precisely identifying the location of the vulnerability within the dependency, specifically vulnerable functions. However, the process of identifying vulnerable functions is currently either manual or uses a naive automated approach that falsely assumes all changed functions in a security patch link are vulnerable. In this paper, we explore using open-source large language models (LLMs) to improve pairing security advisories with vulnerable functions. We explore various prompting strategies, learning paradigms (i.e., zero-shot vs. few-shot), and show our approach generalizes to other open-source LLMs. Compared to the naive automated approach, we show a 173% increase in precision while only having an 18% decrease in recall. The significant increase in precision to enhance vulnerable function identification lays the groundwork for downstream techniques that depend on this critical information for security analysis and threat mitigation. 
    more » « less
  2. Security advisories are the primary channel of communication for discovered vulnerabilities in open-source software, but they often lack crucial information. Specifically, 63% of vulnerability database reports are missing their patch links, also referred to as vulnerability fixing commits (VFCs). This paper introduces VFCFinder, a tool that generates the top-five ranked set of VFCs for a given security advisory using Natural Language Programming Language (NL-PL) models. VFCFinder yields a 96.6% recall for finding the correct VFC within the Top-5 commits, and an 80.0% recall for the Top-1 ranked commit. VFCFinder generalizes to nine different programming languages and outperforms state-of-the-art approaches by 36 percentage points in terms of Top-1 recall. As a practical contribution, we used VFCFinder to backfill over 300 missing VFCs in the GitHub Security Advisory (GHSA) database. All of the VFCs were accepted and merged into the GHSA database. In addition to demonstrating a practical pairing of security advisories to VFCs, our general open-source implementation will allow vulnerability database maintainers to drastically improve data quality, supporting efforts to secure the software supply chain. 
    more » « less
  3. Software depends on upstream projects that regularly fix vulnerabilities, but the documentation of those vulnerabilities is often unreliable or unavailable. Automating the collection of existing vulnerability fixes is essential for downstream projects to reliably update their dependencies due to the sheer number of dependencies in modern software. Prior efforts rely solely on incomplete databases or imprecise or inaccurate statistical analysis of upstream repositories. In this paper, we introduce Differential Alert Analysis (DAA) to discover vulnerability fixes in software projects. In contrast to statistical analysis, DAA leverages static analysis security testing (SAST) tools, which reason over code context and semantics. We provide a language-independent implementation of DAA and show that for Python and Java based projects, DAA has high precision for a ground-truth dataset of vulnerability fixes — even with noisy and low-precision SAST tools. We then use DAA in two large-scale empirical studies covering several prominent ecosystems, finding hundreds of resolved alerts, including many never publicly disclosed. DAA thus provides a powerful, accurate primitive for software projects, code analysis tools, vulnerability databases, and researchers to characterize and enhance the security of software supply chains. 
    more » « less
  4. Desktop operating systems, including macOS, Windows 10, and Linux, are adopting the application-based security model pervasive in mobile platforms. In Linux, this transition is part of the movement towards two distribution-independent application platforms: Flatpak and Snap. This paper provides the first analysis of sandbox policies defined for Flatpak and Snap applications, covering 283 applications contained in both platforms. First, we find that 90.1% of Snaps and 58.3% of Flatpak applications studied are contained by tamperproof sandboxes. Further, we find evidence that package maintainers actively attempt to define least-privilege application policies. However, defining policy is difficult and error-prone. When studying the set of matching applications that appear in both Flatpak and Snap app stores, we frequently found policy mismatches: e.g., the Flatpak version has a broad privilege (e.g., file access) that the Snap version does not, or vice versa. This work provides confidence that Flatpak and Snap improve Linux platform security while highlighting opportunities for improvement. 
    more » « less