NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

ConfuGuard: Using Metadata to Detect Active and Stealthy Package Confusion Attacks Accurately and at Scale

Jiang, Wenxin; Çakar, Berk; Lysenko, Mikola; Davis, James C (August 2025, International Conference on Software Engineering (ICSE) 2026)

Package confusion attacks such as typosquatting threaten soft- ware supply chains. Attackers make packages with names that syntactically or semantically resemble legitimate ones, trick- ing engineers into installing malware. While prior work has developed defenses against package confusions in some soft- ware package registries, notably NPM, PyPI, and RubyGems, gaps remain: high false-positive rates, generalization to more software package ecosystems, and insights from real-world deployment. In this work, we introduce ConfuGuard, a state-of-art de- tector for package confusion threats. We begin by presenting the first empirical analysis of benign signals derived from prior package confusion data, uncovering their threat patterns, engineering practices, and measurable attributes. Advancing existing detectors, we leverage package metadata to distin- guish benign packages, and extend support from three up to seven software package registries. Our approach significantly reduces false positive rates (from 80% to 28%), at the cost of an additional 14s average latency to filter out benign pack- ages by analyzing the package metadata. ConfuGuard is used in production at our industry partner, whose analysts have already confirmed 630 real attacks detected by ConfuGuard
more » « less
Free, publicly-accessible full text available August 1, 2026
“I see models being a whole other thing”: an empirical study of pre-trained model naming conventions and a tool for enhancing naming consistency

https://doi.org/10.1007/s10664-025-10711-4

Jiang, Wenxin; Kim, Mingyu; Cheung, Chingwo; Kim, Heesoo; Thiruvathukal, George_K; Davis, James_C (August 2025, Empirical Software Engineering)

Abstract As innovation in deep learning continues, many engineers are incorporating Pre-Trained Models (PTMs) as components in computer systems. Some PTMs are foundation models, and others are fine-tuned variations adapted to different needs. When these PTMs are named well, it facilitates model discovery and reuse. However, prior research has shown that model names are not always well chosen and can sometimes be inaccurate and misleading. The naming practices for PTM packages have not been systematically studied, which hampers engineers’ ability to efficiently search for and reliably reuse these models. In this paper, we conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry. We begin by reporting on a survey of 108 Hugging Face users, highlighting differences from traditional software package naming and presenting findings on PTM naming practices. The survey results indicate a mismatch between engineers’ preferences and current practices in PTM naming. We then introduce DARA, the first automatedDNNARchitectureAssessment technique designed to detect PTM naming inconsistencies. Our results demonstrate that architectural information alone is sufficient to detect these inconsistencies, achieving an accuracy of 94% in identifying model types and promising performance (over 70%) in other architectural metadata as well. We also highlight potential use cases for automated naming tools, such as model validation, PTM metadata generation and verification, and plagiarism detection. Our study provides a foundation for automating naming inconsistency detection. Finally, we envision future work focusing on automated tools for standardizing package naming, improving model selection and reuse, and strengthening the security of the PTM supply chain.“The main idea is to treat a program as a piece of literature, addressed to human beings rather than to a computer”—D. Knuth
more » « less
Large Language Models for Energy-Efficient Code: Emerging Results and Future Directions

Peng, Huiyun; Gupte, Arjun; Eliopoulos, Nicholas J; Ho, Chien‑Chou; Mantri, Rishi; Deng, Leo; Jiang, Wenxin; Lu, Yung‑Hsiang; Läufer, Konstantin; Thiruvathukal, George K; et al (October 2024, arXiv)

Energy-efficient software helps improve mobile de- vice experiences and reduce the carbon footprint of data centers. However, energy goals are often de-prioritized in order to meet other requirements. We take inspiration from recent work exploring the use of large language models (LLMs) for different software engineering activities. We propose a novel application of LLMs: as code optimizers for energy efficiency. We describe and evaluate a prototype, finding that over 6 small programs our system can improve energy efficiency in 3 of them, up to 2x better than compiler optimizations alone. From our experience, we identify some of the challenges of energy-efficient LLM code optimization and propose a research agenda.
more » « less
Full Text Available
Reusing Deep Learning Models: Challenges and Directions in Software Engineering

https://doi.org/10.1109/JVA60410.2023.00015

Davis, James C; Jajal, Purvish; Jiang, Wenxin; Schorlemmer, Taylor R; Synovic, Nicholas; Thiruvathukal, George K (July 2023, IEEE)

Full Text Available
PTMTorrent: A Dataset for Mining Open-source Pre-trained Model Packages

https://doi.org/10.1109/MSR59073.2023.00021

Jiang, Wenxin; Synovic, Nicholas; Jajal, Purvish; Schorlemmer, Taylor R; Tewari, Arav; Pareek, Bhavesh; Thiruvathukal, George K; Davis, James C (May 2023, IEEE)
An Empirical Study of Pre-Trained Model Reuse in the Hugging Face Deep Learning Model Registry

https://doi.org/10.1109/ICSE48619.2023.00206

Jiang, Wenxin; Synovic, Nicholas; Hyatt, Matt; Schorlemmer, Taylor R; Sethi, Rohan; Lu, Yung-Hsiang; Thiruvathukal, George K; Davis, James C (May 2023, IEEE)
Interoperability in Deep Learning: A User Survey and Failure Analysis of ONNX Model Converters

https://doi.org/10.1145/3650212.3680374

Jajal, Purvish; Jiang, Wenxin; Tewari, Arav; Kocinare, Erik; Woo, Joseph; Sarraf, Anusha; Lu, Yung-Hsiang; Thiruvathukal, George K; Davis, James C (September 2024, ACM -- ISSTA)

Full Text Available
Discrepancies among pre-trained deep neural networks: a new threat to model zoo reliability

https://doi.org/10.1145/3540250.3560881

Montes, Diego; Peerapatanapokin, Pongpatapee; Schultz, Jeff; Guo, Chengjun; Jiang, Wenxin; Davis, James C. (November 2022, ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) — Ideas, Visions and Reflections Track)

Training deep neural networks (DNNs) takes significant time and resources. A practice for expedited deployment is to use pre-trained deep neural networks (PTNNs), often from model zoosÐcollections of PTNNs; yet, the reliability of model zoos remains unexamined. In the absence of an industry standard for the implementation and performance of PTNNs, engineers cannot confidently incorporate them into production systems. As a first step, discovering potential discrepancies between PTNNs across model zoos would reveal a threat to model zoo reliability. Prior works indicated existing variances in deep learning systems in terms of accuracy. However, broader measures of reliability for PTNNs from model zoos are unexplored. This work measures notable discrepancies between accuracy, latency, and architecture of 36 PTNNs across four model zoos. Among the top 10 discrepancies, we find differences of 1.23%ś2.62% in accuracy and 9%ś131% in latency. We also find mismatches in architecture for well-known DNN architectures (e.g., ResNet and AlexNet). Our findings call for future works on empirical validation, automated tools for measurement, and best practices for implementation.
more » « less
Full Text Available
An Empirical Study of Artifacts and Security Risks in the Pre-trained Model Supply Chain

https://doi.org/10.1145/3560835.3564547

Jiang, Wenxin; Synovic, Nicholas; Sethi, Rohan; Indarapu, Aryan; Hyatt, Matt; Schorlemmer, Taylor R.; Thiruvathukal, George K.; Davis, James C. (November 2022, Proceedings of the 1st ACM Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses (SCORED)

Deep neural networks achieve state-of-the-art performance on many tasks, but require increasingly complex architectures and costly training procedures. Engineers can reduce costs by reusing a pre-trained model (PTM) and fine-tuning it for their own tasks. To facilitate software reuse, engineers collaborate around model hubs, collections of PTMs and datasets organized by problem domain. Although model hubs are now comparable in popularity and size to other software ecosystems, the associated PTM supply chain has not yet been examined from a software engineering perspective. We present an empirical study of artifacts and security features in 8 model hubs. We indicate the potential threat models and show that the existing defenses are insufficient for ensuring the security of PTMs. We compare PTM and traditional supply chains, and propose directions for further measurements and tools to increase the reliability of the PTM supply chain.
more » « less
Full Text Available
Establishing Trust in Vehicle-to-Vehicle Coordination: A Sensor Fusion Approach

https://doi.org/10.1109/DI-CPS56137.2022.00008

Veselsky, Jakob; West, Jack; Ahlgren, Isaac; Thiruvathukal, George K.; Klingensmith, Neil; Goel, Abhinav; Jiang, Wenxin; Davis, James C.; Lee, Kyuin; Kim, Younghyun (May 2022, 2022 2nd Workshop on Data-Driven and Intelligent Cyber-Physical Systems for Smart Cities Workshop (DI-CPS))

Autonomous vehicles (AVs) use diverse sensors to understand their surroundings as they continually make safety-critical decisions. However, establishing trust with other AVs is a key prerequisite because safety-critical decisions cannot be made based on data shared from untrusted sources. Existing protocols require an infrastructure network connection and a third-party root of trust to establish a secure channel, which are not always available.In this paper, we propose a sensor-fusion approach for mobile trust establishment, which combines GPS and visual data. The combined data forms evidence that one vehicle is nearby another, which is a strong indication that it is not a remote adversary hence trustworthy. Our preliminary experiments show that our sensor-fusion approach achieves above 80% successful pairing of two legitimate vehicles observing the same object with 5 meters of error. Based on these preliminary results, we anticipate that a refined approach can support fuzzy trust establishment, enabling better collaboration between nearby AVs.
more » « less
Full Text Available

« Prev Next »

Search for: All records