This content will become publicly available on July 12, 2025

Title: GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis
Award ID(s):
1937786; 2131859
NSF-PAR ID:
10522919
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Annual Meeting of the Association for Computational Linguistics
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation