

Title: Bayesian Reinforcement Learning in Factored POMDPs
Model-based Bayesian Reinforcement Learning (BRL) provides a principled solution to the exploration-exploitation trade-off, but such methods typically assume a fully observable environment. The few Bayesian RL methods that are applicable in partially observable domains, such as the Bayes-Adaptive POMDP (BA-POMDP), scale poorly. To address this issue, we introduce the Factored BA-POMDP model (FBA-POMDP), a framework that is able to learn a compact model of the dynamics by exploiting the underlying structure of a POMDP. The FBA-POMDP framework casts the problem as a planning task, for which we adapt the Monte-Carlo Tree Search planning algorithm and develop a belief tracking method to approximate the joint posterior over the state and model variables. Our empirical results show that this method outperforms a number of BRL baselines and is able to learn efficiently when the factorization is known, as well as learn both the factorization and the model parameters simultaneously.
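As a rough illustration of the kind of belief tracking described above, the sketch below gives a generic particle-filter update over joint (state, model) hypotheses in Python. The step_fn and obs_fn callbacks are hypothetical stand-ins for a candidate factored model's transition and observation dynamics; this is a minimal sketch of the general technique under those assumptions, not the paper's exact FBA-POMDP procedure.

```python
import random

def update_belief(particles, action, observation, step_fn, obs_fn, n=1000):
    """One particle-filter update over joint (state, model) hypotheses.

    step_fn(model, state, action) -> next_state samples a transition from a
    candidate factored model; obs_fn(model, next_state, action, observation)
    returns the observation likelihood under that model. Both are supplied
    by the caller (hypothetical interfaces for this sketch).
    """
    weighted = []
    for state, model in particles:
        nxt = step_fn(model, state, action)          # simulate the dynamics
        w = obs_fn(model, nxt, action, observation)  # weight by the evidence
        if w > 0.0:
            weighted.append(((nxt, model), w))
    if not weighted:      # every particle is inconsistent with the observation
        return particles  # in practice one would reinvigorate the particle set
    candidates, weights = zip(*weighted)
    return random.choices(candidates, weights=weights, k=n)
```

Resampling by observation likelihood lets model hypotheses that explain the data well come to dominate the approximate posterior, which is what allows structure and parameters to be learned jointly with the hidden state.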
Award ID(s):
1734497
NSF-PAR ID:
10098819
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems
ISSN:
1548-8403
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents a framework to learn the reward function underlying high-level sequential tasks from demonstrations. The purpose of reward learning, in the context of learning from demonstration (LfD), is to generate policies that mimic the demonstrator’s policies, thereby enabling imitation learning. We focus on a human-robot interaction (HRI) domain where the goal is to learn and model structured interactions between a human and a robot. Such interactions can be modeled as a partially observable Markov decision process (POMDP) where the partial observability is caused by uncertainties associated with the ways humans respond to different stimuli. The key challenge in finding a good policy in such a POMDP is determining the reward function that was observed by the demonstrator. Existing inverse reinforcement learning (IRL) methods for POMDPs are computationally very expensive and the problem is not well understood. In comparison, IRL algorithms for Markov decision processes (MDPs) are well defined and computationally efficient. We propose an approach to reward function learning for high-level sequential tasks from human demonstrations, where the core idea is to reduce the underlying POMDP to an MDP and apply any efficient MDP-IRL algorithm. Our extensive experiments suggest that the reward function learned this way generates POMDP policies that mimic the policies of the demonstrator well.
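As a loose illustration of the reduce-then-apply idea in the abstract above, the sketch below collapses each belief in a demonstration to its most likely hidden state, yielding MDP-style trajectories that any MDP-IRL routine can consume. The most-likely-state reduction and the mdp_irl callback interface are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def reduce_demos_for_mdp_irl(belief_trajectories, mdp_irl):
    """Turn POMDP demonstrations into MDP-style trajectories, then run IRL.

    belief_trajectories: list of trajectories, each a list of
        (belief_vector, action) pairs, where belief_vector is a
        distribution over the hidden states.
    mdp_irl: any callable mapping a list of (state, action) trajectories
        to a reward estimate (hypothetical interface for this sketch).
    """
    mdp_trajectories = []
    for traj in belief_trajectories:
        # Most-likely-state reduction: one possible way to map a belief
        # onto a discrete MDP state; other reductions are possible.
        reduced = [(int(np.argmax(belief)), action) for belief, action in traj]
        mdp_trajectories.append(reduced)
    return mdp_irl(mdp_trajectories)
```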
  2. Recent work has considered personalized route planning based on user profiles, but none of it accounts for human trust. We argue that human trust is an important factor to consider when planning routes for automated vehicles. This article presents a trust-based route-planning approach for automated vehicles. We formalize the human-vehicle interaction as a partially observable Markov decision process (POMDP) and model trust as a partially observable state variable of the POMDP, representing the human’s hidden mental state. We build data-driven models of human trust dynamics and takeover decisions, which are incorporated in the POMDP framework, using data collected from an online user study with 100 participants on the Amazon Mechanical Turk platform. We compute optimal routes for automated vehicles by solving for optimal policies of the POMDP and evaluate the resulting routes via human subject experiments with 22 participants on a driving simulator. The experimental results show that participants taking the trust-based route generally reported more positive responses in the after-driving survey than those taking the baseline (trust-free) route. In addition, we analyze the trade-offs between multiple planning objectives (e.g., trust, distance, energy consumption) via multi-objective optimization of the POMDP. We also identify a set of open issues and implications for real-world deployment of the proposed approach in automated vehicles.
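To make the trust-as-a-partially-observable-state-variable idea concrete, the sketch below performs a Bayesian belief update over three discrete trust levels. The transition matrix and takeover probabilities are made-up placeholders standing in for the data-driven models described in the abstract.

```python
import numpy as np

# Placeholder models: indices correspond to trust levels (low, medium, high).
TRUST_DYNAMICS = np.array([   # P(next trust | current trust) per road segment
    [0.80, 0.15, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
])
P_TAKEOVER = np.array([0.70, 0.40, 0.10])  # P(human takes over | trust level)

def update_trust_belief(belief, took_over):
    """Predict trust drift, then condition on the observed takeover decision."""
    predicted = belief @ TRUST_DYNAMICS
    likelihood = P_TAKEOVER if took_over else 1.0 - P_TAKEOVER
    posterior = predicted * likelihood
    return posterior / posterior.sum()

belief = np.array([1 / 3, 1 / 3, 1 / 3])               # initially uncertain
belief = update_trust_belief(belief, took_over=False)  # no takeover observed
```

In a route-planning POMDP of this kind, such belief updates would run per road segment, and the planner would favor routes whose expected value is highest under the current trust belief.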
  3. Representing and reasoning about uncertainty is crucial for autonomous agents acting in partially observable environments with noisy sensors. Partially observable Markov decision processes (POMDPs) serve as a general framework for representing problems in which uncertainty is an important factor. Online sample-based POMDP methods have emerged as efficient approaches to solving large POMDPs and have been shown to extend to continuous domains. However, these solutions struggle to find long-horizon plans in problems with significant uncertainty. Exploration heuristics can help guide planning, but many real-world settings contain significant task-irrelevant uncertainty that might distract from the task objective. In this paper, we propose STRUG, an online POMDP solver capable of handling domains that require long-horizon planning with significant task-relevant and task-irrelevant uncertainty. We demonstrate our solution on several temporally extended versions of toy POMDP problems as well as robotic manipulation of articulated objects using a neural perception frontend to construct a distribution of possible models. Our results show that STRUG outperforms the current sample-based online POMDP solvers on several tasks.
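The distinction between task-relevant and task-irrelevant uncertainty can be illustrated with a small helper that measures uncertainty only over the state variables that matter for the task. This is purely illustrative of the concept, under assumed variable names, and does not reflect STRUG's internals.

```python
from collections import Counter
from math import log

def task_relevant_entropy(particles, relevant_keys):
    """Entropy of a particle set projected onto task-relevant variables only.

    particles: list of dicts mapping variable name -> value
    relevant_keys: names of the variables that matter for the task;
        uncertainty in all other variables is deliberately ignored.
    """
    projected = Counter(tuple(p[k] for k in relevant_keys) for p in particles)
    n = len(particles)
    return -sum((c / n) * log(c / n) for c in projected.values())

# Example: only the drawer's open/closed state matters for the task;
# the handle's exact pose does not (variable names are hypothetical).
belief = [{"drawer": "closed", "handle_x": 0.31},
          {"drawer": "closed", "handle_x": 0.29},
          {"drawer": "open",   "handle_x": 0.30}]
print(task_relevant_entropy(belief, ["drawer"]))
```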
  4. Purpose: Personalized screening guidelines can be an effective strategy to prevent diabetic retinopathy (DR)-related vision loss. However, these strategies typically do not capture behavior-based factors such as a patient’s compliance or cost preferences. This study develops a mathematical model to identify screening policies that capture both DR progression and behavioral factors to provide personalized recommendations. Methods: A partially observable Markov decision process (POMDP) model is developed to provide personalized screening recommendations. For each patient, the model estimates the patient’s probability of having a sight-threatening diabetic eye disorder (STDED) yearly via Bayesian inference based on natural history, screening results, and compliance behavior. The model then determines a personalized, threshold-based recommendation for each patient annually, choosing among no action (NA), teleretinal imaging (TRI), and clinical screening (CS), based on the patient’s current probability of having STDED as well as the patient-specific preference between cost saving ($) and QALY gain. The framework is applied to a hypothetical cohort of 40-year-old African American male patients. Results: For the base population with TRI and CS compliance rates of 65% and 55% and equal preference for cost and QALY, NA is identified as the optimal recommendation when the patient’s probability of having STDED is less than 0.72%, TRI when the probability is in [0.72%, 2.09%], and CS when the probability is above 2.09%. Simulated against annual clinical screening, the model-based policy yields an average decrease of 7.07% in cost/QALY (95% CI: 6.93-7.23%) and of 15.05% in blindness prevalence over a patient’s lifetime (95% CI: 14.88-15.23%). For patients with equal preference for cost and QALY, the model identifies 6 different types of threshold-based policies (see Fig 1). For patients with a strong preference for QALY gain, CS-only policies increased prevalence by a factor of 19.2 (see Fig 2). Conclusions: The POMDP model is highly flexible and responsive in incorporating behavioral factors when providing personalized screening recommendations. As a decision support tool, providers can use this modeling framework to provide unique, catered recommendations.
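The abstract reports concrete thresholds for its base case, which the sketch below turns into a small threshold policy together with a generic Bayes update of P(STDED) after a negative screening result. The sensitivity/specificity values and the example prior are placeholders for this sketch, not parameters from the study.

```python
def recommend(p_stded, lo=0.0072, hi=0.0209):
    """Threshold policy for the base case quoted in the abstract:
    no action below 0.72%, teleretinal imaging up to 2.09%,
    clinical screening above that."""
    if p_stded < lo:
        return "NA"   # no action this year
    if p_stded <= hi:
        return "TRI"  # teleretinal imaging
    return "CS"       # clinical screening

def posterior_after_negative_test(prior, sensitivity, specificity):
    """Bayes update of P(STDED) given a negative screening result."""
    p_neg_given_disease = 1.0 - sensitivity
    p_neg_given_healthy = specificity
    numer = p_neg_given_disease * prior
    return numer / (numer + p_neg_given_healthy * (1.0 - prior))

# Placeholder test characteristics and prior, for illustration only.
p = posterior_after_negative_test(prior=0.015, sensitivity=0.80, specificity=0.90)
print(recommend(p))  # posterior drops below 0.72%, so the policy returns "NA"
```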
  5. To be responsive to dynamically changing real-world environments, an intelligent agent needs to perform complex sequential decision-making tasks that are often guided by commonsense knowledge. Previous work on this line of research led to the framework called interleaved commonsense reasoning and probabilistic planning (icorpp), which used P-log for representing commonsense knowledge and Markov Decision Processes (MDPs) or Partially Observable MDPs (POMDPs) for planning under uncertainty. A main limitation of icorpp is that its implementation requires non-trivial engineering effort to bridge the commonsense reasoning and probabilistic planning formalisms. In this paper, we present a unified framework to integrate icorpp’s reasoning and planning components. In particular, we extend the probabilistic action language pBC+ to express utility, belief states, and observations as in POMDP models. Inheriting the advantages of action languages, the new action language provides an elaboration-tolerant representation of POMDPs that reflects commonsense knowledge. This idea led to the design of the system pbcplus2pomdp, which compiles a pBC+ action description into a POMDP model that can be directly processed by off-the-shelf POMDP solvers to compute an optimal policy of the pBC+ action description. Our experiments show that it retains the advantages of icorpp while avoiding the manual effort of bridging the commonsense reasoner and the probabilistic planner.