Learning Optimal Advantage from Preferences and Mistaking it for Reward.

Knox, A; Hatgis-Kessell, S; Adalgeirsson, S; Booth, S; Dragan, A; Stone, P; Niekum, S

doi:10.1609/aaai.v38i9.28870

Citation Details

Award ID(s): 2323384; 1749204

PAR ID: 10495509

Author(s) / Creator(s): Knox, A; Hatgis-Kessell, S; Adalgeirsson, S; Booth, S; Dragan, A; Stone, P; Niekum, S

Publisher / Repository: AAAI

Date Published: 2024-02-01

Journal Name: Proceedings of the AAAI Conference on Artificial Intelligence

ISSN: 2159-5399

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper: https://doi.org/10.1609/aaai.v38i9.28870
