Bounded Rationality in Las Vegas: Probabilistic Finite Automata Play Multi-Armed Bandits

Liu, Xinming; Halpern, Joseph Y.


While traditional economics assumes that humans are fully rational agents who always maximize their expected utility, in practice, we constantly observe apparently irrational behavior. One explanation is that people have limited computational power, so that they are, quite rationally, making the best decisions they can, given their computational limitations. To test this hypothesis, we consider the multi-armed bandit (MAB) problem. We examine a simple strategy for playing an MAB that can be implemented easily by a probabilistic finite automaton (PFA). Roughly speaking, the PFA sets certain expectations, and plays an arm as long as it meets them. If the PFA has sufficiently many states, it performs near-optimally. Its performance degrades gracefully as the number of states decreases. Moreover, the PFA acts in a “human-like” way, exhibiting a number of standard human biases, like an optimism bias and a negativity bias.

Award ID(s): 1703846

PAR ID: 10165587

Author(s) / Creator(s): Liu, Xinming; Halpern, Joseph Y.

Date Published: 2020-08-01

Journal Name: Proceedings of the 36th Conference on Uncertainty in AI (UAI 2020)

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text: Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
