Background: Software Package Registries (SPRs) are an integral part of the software supply chain. These collaborative platforms unite contributors, users, and packages, and they streamline pack- age management. Much engineering work focuses on synthesizing packages from SPRs into a downstream project. Prior work has thoroughly characterized the SPRs associated with traditional soft- ware, such as NPM (JavaScript) and PyPI (Python). Pre-Trained Model (PTM) Registries are an emerging class of SPR of increasing importance, because they support the deep learning supply chain. Aims: A growing body of empirical research has examined PTM registries from various angles, such as vulnerabilities, reuse processes, and evolution. However, no existing research synthesizes them to provide a systematic understanding of the current knowledge. Furthermore, much of the existing research includes unsupported qualitative claims and lacks sufficient quantitative analysis. Our research aims to fill these gaps by providing a thorough knowledge synthesis and use it to inform further quantitative analysis. Methods: To consolidate existing knowledge on PTM reuse, we first conduct a systematic literature review (SLR). We then observe that some of the claims are qualitative and lack quantitative evidence. We identify quantifiable metrics associated with those claims, and measure in order to substantiate these claims. Results: From our SLR, we identify 12 claims about PTM reuse on the HuggingFace platform, 4 of which lack quantitative validation. We successfully test 3 of these claims through a quantitative analysis, and directly compare one with traditional software. Our findings corroborate qualitative claims with quantitative measurements. Our two most notable findings are: (1) PTMs have a significantly higher turnover rate than traditional software, indicating a dynamic and rapidly evolving reuse environment within the PTM ecosystem; and (2) There is a strong correlation between documentation quality and PTM popularity. Conclusions: Our findings validate several qual- stative research claims with concrete metrics, confirming prior qualitative and case study research. Our measures show further dynamics of PTM reuse, motivating further research infrastructure and new kinds of measurements.
more »
« less
This content will become publicly available on August 1, 2026
ConfuGuard: Using Metadata to Detect Active and Stealthy Package Confusion Attacks Accurately and at Scale
Package confusion attacks such as typosquatting threaten soft- ware supply chains. Attackers make packages with names that syntactically or semantically resemble legitimate ones, trick- ing engineers into installing malware. While prior work has developed defenses against package confusions in some soft- ware package registries, notably NPM, PyPI, and RubyGems, gaps remain: high false-positive rates, generalization to more software package ecosystems, and insights from real-world deployment. In this work, we introduce ConfuGuard, a state-of-art de- tector for package confusion threats. We begin by presenting the first empirical analysis of benign signals derived from prior package confusion data, uncovering their threat patterns, engineering practices, and measurable attributes. Advancing existing detectors, we leverage package metadata to distin- guish benign packages, and extend support from three up to seven software package registries. Our approach significantly reduces false positive rates (from 80% to 28%), at the cost of an additional 14s average latency to filter out benign pack- ages by analyzing the package metadata. ConfuGuard is used in production at our industry partner, whose analysts have already confirmed 630 real attacks detected by ConfuGuard
more »
« less
- PAR ID:
- 10644220
- Publisher / Repository:
- International Conference on Software Engineering (ICSE) 2026
- Date Published:
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Background: Software Package Registries (SPRs) are an integral part of the software supply chain. These collaborative platforms unite contributors, users, and packages, and they streamline pack- age management. Much engineering work focuses on synthesizing packages from SPRs into a downstream project. Prior work has thoroughly characterized the SPRs associated with traditional soft- ware, such as NPM (JavaScript) and PyPI (Python). Pre-Trained Model (PTM) Registries are an emerging class of SPR of increasing importance, because they support the deep learning supply chain. Aims: A growing body of empirical research has examined PTM reg- istries from various angles, such as vulnerabilities, reuse processes, and evolution. However, no existing research synthesizes them to provide a systematic understanding of the current knowledge. Furthermore, much of the existing research includes unsupported qualitative claims and lacks sufficient quantitative analysis. Our research aims to fill these gaps by providing a thorough knowledge synthesis and use it to inform further quantitative analysis. Methods: To consolidate existing knowledge on PTM reuse, we first conduct a systematic literature review (SLR). We then observe that some of the claims are qualitative and lack quantitative evi- dence. We identify quantifiable metrics assoiated with those claims, and measure in order to substantiate these claims. Results: From our SLR, we identify 12 claims about PTM reuse on the HuggingFace platform, 4 of which lack quantitative validation. We successfully test 3 of these claims through a quantitative analysis, and directly compare one with traditional software. Our findings corroborate qualitative claims with quantitative measurements. Our two most notable findings are: (1) PTMs have a significantly higher turnover rate than traditional software, indicating a dynamic and rapidly evolving reuse environment within the PTM ecosystem; and (2) There is a strong correlation between documentation quality and PTM popularity. Conclusions: Our findings validate several qual- itative research claims with concrete metrics, confirming prior qualitative and case study research. Our measures show further dynamics of PTM reuse, motivating further research infrastructure and new kinds of measurements.more » « less
-
Modern software installation tools often use packages from more than one repository, presenting a unique set of security challenges. Such a configuration increases the risk of repository compromise and introduces attacks like dependency confusion and repository fallback. In this paper, we offer the first exploration of attacks that specifically target multiple repository update systems, and propose a unique defensive strategy we call articulated trust. Articulated trust is a principle that allows software installation tools to specify trusted developers and repositories for each package. To implement articulated trust, we built Artemis, a framework that introduces several new security techniques, such as per-package prioritization of repositories, multi-role delegations, multiple-repository consensus, and key pinning. These techniques allow for a greater diversity of trust relationships while eliminating the security risk of single points of failure. To evaluate Artemis, we examine attacks on software update systems from the Cloud Native Computing Foundation’s Catalog of Supply Chain Compromises, and find that the most secure configuration of Artemis can prevent all of them, compared to 14-59% for the best existing system. We also cite real-world deployments of Artemis that highlight its practicality. These include the JDF/Linux Foundation Uptane Standard that secures over-the-air updates for millions of automobiles, and TUF, which is used by many companies for secure software distribution.more » « less
-
The integrity of software builds is fundamental to the security of the software supply chain. While Thompson first raised the potential for attacks on build infrastructure in 1984, limited attention has been given to build integrity in the past 40 years, enabling recent attacks on SolarWinds, event-stream, and xz. The best-known defense against build system attacks is creating reproducible builds; however, achieving them can be complex for both technical and social reasons and thus is often viewed as impractical to obtain. In this paper, we analyze reproducibility of builds in a novel context: reusable components distributed as packages in six popular software ecosystems (npm, Maven, PyPI, Go, RubyGems, and Cargo). Our quantitative study on a representative sample of 4000 packages in each ecosystem raises concerns: Rates of reproducible builds vary widely between ecosystems, with some ecosystems having all packages reproducible whereas others have reproducibility issues in nearly every package. However, upon deeper investigation, we identified that with relatively straightforward infrastructure configuration and patching of build tools, we can achieve very high rates of reproducible builds in all studied ecosystems. We conclude that if the ecosystems adopt our suggestions, the build process of published packages can be independently confirmed for nearly all packages without individual developer actions, and doing so will prevent significant future software supply chain attacks.more » « less
-
With the increasing popularity of containerized applications, container registries have hosted millions of repositories that allow developers to store, manage, and share their software. Unfortunately, they have also become a hotbed for adversaries to spread malicious images to the public. In this paper, we present the first in-depth study on the vulnerability of container registries to typosquatting attacks, in which adversaries intentionally upload malicious images with an identification similar to that of a benign image so that users may accidentally download malicious images due to typos. We demonstrate that such typosquatting attacks could pose a serious security threat in both public and private registries as well as across multiple platforms. To shed light on the container registry typosquatting threat, we first conduct a measurement study and a 210-day proof-of-concept exploitation on public container registries, revealing that human users indeed make random typos and download unwanted container images. We also systematically investigate attack vectors on private registries and reveal that its naming space is open and could be easily exploited for launching a typosquatting attack. In addition, for a typosquatting attack across multiple platforms, we demonstrate that adversaries can easily self-host malicious registries or exploit existing container registries to manipulate repositories with similar identifications. Finally, we propose CRYSTAL, a lightweight extension to existing image management, which effectively defends against typosquatting attacks from both container users and registries.more » « less
An official website of the United States government
