NSF PAR Search | NSF Public Access Repository

Analyzing Code Injection Attacks on LLM-based Multi-Agent Systems in Software Development

Bowers, Brian; Khapre, Smita; Kalita, Jugal (December 2025, International Conference on Machine Learning Applications (ICMLA))

Agentic AI and Multi-Agent Systems are poised to dominate industry and society imminently. Powered by goal-driven autonomy, they represent a powerful form of generative AI, marking a transition from reactive content generation into proactive multitasking capabilities. As an exemplar, we propose an architecture of a multi-agent system for the implementation phase of the software engineering process. We also present a comprehensive threat model for the proposed system. We demonstrate that while such systems can generate code quite accurately, they are vulnerable to attacks, including code injection. Due to their autonomous design and lack of humans in the loop, these systems cannot identify and respond to attacks by themselves. This paper analyzes the vulnerability of multi-agent systems and concludes that the coder-reviewer-tester architecture is more resilient than both the coder and coder-tester architectures, but is less efficient at writing code. We find that by adding a security analysis agent, we mitigate the loss in efficiency while achieving even better resiliency. We conclude by demonstrating that the security analysis agent is vulnerable to advanced code injection attacks, showing that embedding poisonous few-shot examples in the injected code can increase the attack success rate from 0% to 71.95%.
Free, publicly-accessible full text available December 1, 2026
Solving Math Word Problems Using Estimation Verification and Equation Generation

Piehl, Mitchell; Wilson, Dillon; Kalita, Ananya; Kalita, Jugal (December 2025, International Conference on Machine Learning Applications (ICMLA))

Large Language Models (LLMs) excel at various tasks, including problem-solving and question-answering. However, LLMs often find Math Word Problems (MWPs) challenging because solving them requires a range of reasoning and mathematical abilities with which LLMs seem to struggle. Recent efforts have helped LLMs solve more complex MWPs with improved prompts. This study proposes a novel method that initially prompts an LLM to create equations from a decomposition of the question, followed by using an external symbolic equation solver to produce an answer. To ensure the accuracy of the obtained answer, inspired by an established recommendation of math teachers, the LLM is instructed to solve the MWP a second time, but this time with the objective of estimating the correct answer instead of solving it exactly. The estimation is then compared to the generated answer to verify. If verification fails, an iterative rectification process is employed to ensure the correct answer is eventually found. This approach achieves new state-of-the-art results on datasets used by prior published research on numeric and algebraic MWPs, improving the previous best results by nearly two percent on average. In addition, the approach obtains satisfactory results on trigonometric MWPs, a task not previously attempted to the authors’ best knowledge. This study also introduces two new datasets, SVAMPClean and Trig300, to further advance the testing of LLMs’ reasoning abilities.
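The estimate-then-verify step described in this abstract can be illustrated with a minimal sketch. It assumes a single linear equation and a relative-tolerance comparison. The names `solve_linear` and `verify_with_estimate` and the `rel_tol` value are hypothetical stand-ins chosen for illustration, not the authors' solver or acceptance rule.

```python
from fractions import Fraction


def solve_linear(a, b, c):
    """Solve a*x + b = c exactly for x.

    Hypothetical stand-in for the external symbolic equation solver.
    """
    if a == 0:
        raise ValueError("coefficient a must be nonzero")
    return (Fraction(c) - Fraction(b)) / Fraction(a)


def verify_with_estimate(exact, estimate, rel_tol=0.25):
    """Accept the exact answer only if it is close to the rough estimate.

    rel_tol is an assumed tolerance; the abstract does not state one.
    """
    exact_f, estimate_f = float(exact), float(estimate)
    scale = max(abs(exact_f), abs(estimate_f), 1e-9)
    return abs(exact_f - estimate_f) / scale <= rel_tol


# "Three equal boxes plus 4 loose pens make 19 pens": 3*x + 4 = 19.
answer = solve_linear(3, 4, 19)
print(answer)                              # exact solution from the solver
print(verify_with_estimate(answer, 5.5))   # a plausible estimate agrees
print(verify_with_estimate(answer, 50))    # a wild mismatch would trigger rectification
```

In this sketch a failed check would simply be the signal to regenerate the equations, which corresponds to the iterative rectification the abstract mentions.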
Free, publicly-accessible full text available December 1, 2026
TinyML for Speech Recognition

https://doi.org/10.1109/COMPSAC65507.2025.00220

Barovic, Andrew; Moin, Armin (July 2025, IEEE)

We train and deploy a quantized 1D convolutional neural network model to conduct speech recognition on a highly resource-constrained IoT edge device. This can be useful in various Internet of Things (IoT) applications, such as smart homes and ambient assisted living for the elderly and people with disabilities, to name a few examples. In this paper, we first create a new dataset with over one hour of audio data that enables our research and will be useful to future studies in this field. Second, we utilize the technologies provided by Edge Impulse to enhance our model’s performance and achieve a high accuracy of up to 97% on our dataset. For the validation, we implement our prototype using the Arduino Nano 33 BLE Sense microcontroller board. This microcontroller board is specifically designed for IoT and AI applications, making it an ideal choice for our target use case scenarios. While most existing research focuses on a limited set of keywords, our model can process 23 different keywords, enabling complex commands.
Free, publicly-accessible full text available July 8, 2026
Automated Duplicate Bug Report Detection in Large Open Bug Repositories

https://doi.org/10.1109/COMPSAC65507.2025.00065

Laney, Clare E; Barovic, Andrew; Moin, Armin (July 2025, IEEE)

Many users and contributors of large open-source projects report software defects or enhancement requests (known as bug reports) to the issue-tracking systems. However, they sometimes report issues that have already been reported. First, they may not have time to do sufficient research on existing bug reports. Second, they may not possess the right expertise in that specific area to realize that an existing bug report is essentially elaborating on the same matter, perhaps with a different wording. In this paper, we propose a novel approach based on machine learning methods that can automatically detect duplicate bug reports in an open bug repository based on the textual data in the reports. We present six alternative methods: topic modeling, Gaussian Naïve Bayes, deep learning, time-based organization, clustering, and summarization using a generative pre-trained transformer large language model. Additionally, we introduce a novel threshold-based approach for duplicate identification, in contrast to the conventional top-k selection method that has been widely used in the literature. Our approach demonstrates promising results across all the proposed methods, achieving accuracy rates ranging from the high 70s to the low 90s in percentage terms. We evaluated our methods on a public dataset of issues belonging to an Eclipse open-source project.
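The contrast between top-k selection and threshold-based identification can be sketched as follows. This is a toy illustration that assumes bag-of-words cosine similarity as the score. The similarity measure, the function names, and the threshold value are assumptions made for the example, not the paper's models or settings.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens with counts (a deliberately crude representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(query, reports):
    """Score every existing report against the query, best first (ties by index)."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(r)), i) for i, r in enumerate(reports)]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


def top_k(query, reports, k):
    """Conventional selection: always return the k best-scoring reports."""
    return [i for _, i in rank(query, reports)[:k]]


def above_threshold(query, reports, threshold):
    """Threshold selection: return only reports whose score clears the bar."""
    return [i for score, i in rank(query, reports) if score >= threshold]


reports = [
    "editor crashes on save",
    "crash when saving file in editor",
    "dark theme request",
]
print(top_k("editor crashes on save", reports, 2))
print(above_threshold("editor crashes on save", reports, 0.5))
print(top_k("printer offline", reports, 2))            # still returns 2 candidates
print(above_threshold("printer offline", reports, 0.5))  # returns none
```

The last two calls show the design difference: top-k proposes candidates even when nothing is similar, while a threshold can report that a new issue has no likely duplicate.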
Free, publicly-accessible full text available July 8, 2026
Linear Relational Decoding of Morphology in Language Models

Xia, Eric; Kalita, Jugal (March 2025, NAACL, aclanthology.org)

A two-part affine approximation has been found to be a good approximation for transformer computations over certain subject-object relations. Adapting the Bigger Analogy Test Set, we show that the linear transformation Ws, where s is a middle layer representation of a subject token and W is derived from model derivatives, is also able to accurately reproduce final object states for many relations. This linear technique is able to achieve 90% faithfulness on morphological relations, and we show similar findings multi-lingually and across models. Our findings indicate that some conceptual relationships in language models, such as morphology, are readily interpretable from latent space, and are sparsely encoded by cross-layer linear transformations.
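One way to read the "two-part affine approximation" is as a first-order Taylor expansion. The notation below is an assumption introduced for illustration: $F$ denotes the model's map from the middle-layer subject representation $s$ to the final object state, and $s_0$ is a reference point. Neither symbol appears in the abstract.

```latex
F(s) \;\approx\; F(s_0) + J(s_0)\,(s - s_0) \;=\; W s + b,
\qquad
W = J(s_0) = \left.\frac{\partial F}{\partial s}\right|_{s = s_0},
\qquad
b = F(s_0) - J(s_0)\, s_0 .
```

Under this reading, the two parts are the linear term $W s$ and the bias $b$, and the abstract reports that the $W s$ term is also able to reproduce final object states for many relations.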
Free, publicly-accessible full text available March 1, 2026