Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
- Award ID(s): 2131938
- PAR ID: 10549877
- Publisher / Repository: ICLR
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation