The Expected-Length Model of Options

Abel, David; Winder, John; desJardins, Marie; Littman, Michael L.

doi:10.24963

Citation Details

Effective options can make reinforcement learning easier by enhancing an agent's ability to both explore in a targeted manner and plan further into the future. However, learning an appropriate model of an option's dynamics is hard, as it requires estimating a highly parameterized probability distribution. This paper introduces and motivates the Expected-Length Model (ELM) for options, an alternate model for transition dynamics. We prove ELM is a (biased) estimator of the traditional Multi-Time Model (MTM), but provide a non-vacuous bound on their deviation. We further prove that, in stochastic shortest path problems, ELM induces a value function that is sufficiently similar to the one induced by MTM, and is thus capable of supporting near-optimal behavior. We explore the practical utility of this option model experimentally, finding consistent support for the thesis that ELM is a suitable replacement for MTM. In some cases, we find ELM leads to more sample-efficient learning, especially when options are arranged in a hierarchy.

Award ID(s): 1637937

PAR ID: 10113326

Author(s) / Creator(s): Abel, David; Winder, John; desJardins, Marie; Littman, Michael L.

Date Published: 2019-08-14

Journal Name: IJCAI

ISSN: 1045-0823

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.24963
