Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits | NSF Public Access Repository

Citation Details

Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits

Award ID(s): 2107455; 2210734

PAR ID: 10329700

Author(s) / Creator(s): Guo, Wenshuo; Agrawal, Kumar K.; Grover, Aditya; Muthukumar, Vidya K.; Pananjady, Ashwin

Date Published: 2022-03-01

Journal Name: International Conference on Artificial Intelligence and Statistics (AISTATS)

Page Range / eLocation ID: 6357-6386

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.