This content will become publicly available on May 16, 2026
Preference Poisoning Attacks on Reward Model Learning