Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
- Award ID(s): 2131938
- PAR ID: 10549877
- Publisher / Repository: ICLR
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation