Multi-Armed Bandits for Human-Machine Decision Making

Reverdy, Paul; Srivastava, Vaibhav

doi:10.1109/ICASSP.2018.8461843

Building an integrated human-machine decision-making system requires developing effective interfaces between the human and the machine. We develop such an interface by studying the multi-armed bandit problem, a simple sequential decision-making paradigm that can model a variety of tasks. We construct Bayesian algorithms for the multi-armed bandit problem, prove conditions under which these algorithms achieve good performance, and empirically show that, with appropriate priors, these algorithms effectively model human choice behavior; the priors then form a principled interface from human to machine. We take a signal processing perspective on the prior estimation problem and develop methods to estimate the priors given human choice data.
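To make the idea of a Bayesian bandit algorithm with priors concrete, here is a minimal sketch. It is not taken from the paper: it assumes Gaussian rewards with known noise variance, a conjugate Gaussian prior on each arm's mean, and an upper-credible-limit selection rule with an illustrative credible-level schedule. The function name `run_bayesian_bandit` and all parameter names are hypothetical. The prior parameters (`prior_means`, `prior_var`) play the role the abstract assigns to priors: they are the knobs through which beliefs enter the algorithm.

```python
import math
import random
from statistics import NormalDist


def run_bayesian_bandit(true_means, prior_means, prior_var, noise_var,
                        horizon, seed=0):
    """Simulate a Gaussian bandit with conjugate priors (illustrative only).

    Assumptions (not from the source): rewards are Gaussian with known
    variance `noise_var`; each arm's mean has an independent Gaussian prior
    with mean `prior_means[i]` and variance `prior_var`; the arm with the
    largest upper credible limit is pulled at each step.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    post_mean = list(prior_means)             # posterior mean per arm
    post_prec = [1.0 / prior_var] * n_arms    # posterior precision per arm
    choices = []

    for t in range(1, horizon + 1):
        # Assumed schedule: the credible level 1 - 1/(t + 1) rises with t,
        # so the exploration bonus grows slowly over time.
        z = NormalDist().inv_cdf(1.0 - 1.0 / (t + 1))
        index = [post_mean[i] + z / math.sqrt(post_prec[i])
                 for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: index[i])

        # Draw a noisy reward from the chosen arm.
        reward = rng.gauss(true_means[arm], math.sqrt(noise_var))

        # Conjugate Gaussian update: precisions add, means are
        # precision-weighted averages.
        new_prec = post_prec[arm] + 1.0 / noise_var
        post_mean[arm] = (post_prec[arm] * post_mean[arm]
                          + reward / noise_var) / new_prec
        post_prec[arm] = new_prec
        choices.append(arm)

    return choices, post_mean, post_prec


if __name__ == "__main__":
    picks, means, _ = run_bayesian_bandit(
        true_means=[0.0, 1.0], prior_means=[0.0, 0.0],
        prior_var=10.0, noise_var=1.0, horizon=200)
    print("pulls per arm:", [picks.count(a) for a in range(2)])
    print("posterior means:", means)
```

Changing `prior_means` or `prior_var` changes which arms are explored early, which is the sense in which a fitted prior could summarize a decision-maker's initial beliefs in a model of this kind.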

Award ID(s): 1734272

PAR ID: 10108283

Author(s) / Creator(s): Reverdy, Paul; Srivastava, Vaibhav

Date Published: 2018-04-01

Journal Name: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Page Range / eLocation ID: 6986 to 6990

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper: https://doi.org/10.1109/ICASSP.2018.8461843
