NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Cleangen: Mitigating backdoor attacks for generation tasks in large language models

Li, Y; Xu, Z; Jiang, F; Niu, L; Sahabandu, D; Ramasubramanian, B; Poovendran, R (November 2024, Conference on Empirical Methods in Natural Language Processing (EMNLP))

The remarkable performance of large language models (LLMs) in generation tasks has enabled practitioners to leverage publicly available models to power custom applications, such as chatbots and virtual assistants. However, the data used to train or fine-tune these LLMs is often undisclosed, allowing an attacker to compromise the data and inject backdoors into the models. In this paper, we develop a novel inference time defense, named CLEANGEN, to mitigate backdoor attacks for generation tasks in LLMs. CLEANGEN is a lightweight and effective decoding strategy that is compatible with the state-of-the-art (SOTA) LLMs. Our insight behind CLEANGEN is that compared to other LLMs, back doored LLMs assign significantly higher probabilities to tokens representing the attacker-desired contents. These discrepancies in token probabilities enable CLEANGEN to identify suspicious tokens favored by the attacker and replace them with tokens generated by another LLM that is not compromised by the same attacker, thereby avoiding generation of attacker-desired content. We evaluate CLEANGEN against five SOTA backdoor attacks. Our results show that CLEANGEN achieves lower attack success rates (ASR) compared to five SOTA baseline defenses for all five backdoor attacks. Moreover, LLMs deploying CLEANGEN maintain helpfulness in their responses when serving benign user queries with minimal added computational overhead.
more » « less
Full Text Available
A Compositional Approach to Safety-Critical Resilient Control for Systems with Coupled Dynamics

Al Maruf, A.; Niu, L.; Clark, A.; Mertoguno, J.S.; Poovendran, R. (December 2022, IEEE Conference on Decision and Control)

Full Text Available
Secure Control Under Partial Observability with Temporal Logic Constraints

https://doi.org/https://doi.org/10.23919/ACC.2019.8814630

Ramasubramanian, B; Clark, A; Bushnell, L; Poovendran, R. (January 2019, American Control Conference (ACC))

This paper studies the synthesis of control policies for an agent that has to satisfy a temporal logic specification in a partially observable environment, in the presence of an adversary. The interaction of the agent (defender) with the adversary is modeled as a partially observable stochastic game. The search for policies is limited to over the space of finite state controllers, which leads to a tractable approach to determine policies. The goal is to generate a defender policy to maximize satisfaction of a given temporal logic specification under any adversary policy. We relate the satisfaction of the specification in terms of reaching (a subset of) recurrent states of a Markov chain. We then present a procedure to determine a set of defender and adversary finite state controllers of given sizes that will satisfy the temporal logic specification. We illustrate our approach with an example.
more » « less
Full Text Available
Linear Temporal Logic Satisfaction in Adversarial Environments Using Secure Control Barrier Certificates

https://doi.org/https://doi.org/10.1007/978-3-030-32430-8_23

Ramasubramanian, B; Niu, L.; Clark, A; Bushnell, L; Poovendran, R (January 2019, IEEE Conference on Decision and Game Theory for Security)

This paper studies the satisfaction of a class of temporal properties for cyber-physical systems (CPSs) over a finite-time horizon in the presence of an adversary, in an environment described by discretetime dynamics. The temporal logic specification is given in safe−LTLF , a fragment of linear temporal logic over traces of finite length. The interaction of the CPS with the adversary is modeled as a two-player zerosum discrete-time dynamic stochastic game with the CPS as defender. We formulate a dynamic programming based approach to determine a stationary defender policy that maximizes the probability of satisfaction of a safe − LTLF formula over a finite time-horizon under any stationary adversary policy. We introduce secure control barrier certificates (S-CBCs), a generalization of barrier certificates and control barrier certificates that accounts for the presence of an adversary, and use S-CBCs to provide a lower bound on the above satisfaction probability. When the dynamics of the evolution of the system state has a specific underlying structure, we present a way to determine an S-CBC as a polynomial in the state variables using sum-of-squares optimization. An illustrative example demonstrates our approach.
more » « less
Full Text Available

Search for: All records