An Impossibility Result in Automata-Theoretic Reinforcement Learning.

Hahn, Ernst Moritz; Perez, Mateo; Schewe, Sven; Somenzi, Fabio; Trivedi, Ashutosh; Wojtczak, Dominik

doi:10.1007/978-3-031-19992-9_3

The expanding role of reinforcement learning (RL) in safety-critical system design has promoted ω-automata as a way to express learning requirements (often non-Markovian) with greater ease of expression and interpretation than scalar reward signals. When ω-automata were first proposed in model-free RL, deterministic Rabin acceptance conditions were used in an attempt to provide a direct translation from ω-automata to finite-state “reward” machines defined over the same automaton structure (a memoryless reward translation). While these initial attempts to provide faithful, memoryless reward translations for Rabin acceptance conditions remained unsuccessful, translations were discovered for other acceptance conditions such as suitable limit-deterministic Büchi acceptance or, more generally, good-for-MDP Büchi acceptance conditions. Yet the question of whether a memoryless translation of Rabin conditions to scalar rewards exists remained unresolved. This paper presents an impossibility result implying that any attempt to use Rabin automata directly (without extra memory) for model-free RL is bound to fail. To establish this result, we link the class of automata enabling memoryless reward translation to closure properties of its accepting and rejecting infinity sets, and to the insight that both the property and its complement need to allow for positional strategies for such an approach to work. We believe that such impossibility results will provide foundations for the application of RL to safety-critical systems.
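
As a rough illustration of what "closure properties of accepting and rejecting infinity sets" can mean, the sketch below treats an acceptance condition abstractly as a predicate on the set of states visited infinitely often and checks closure under union. This is not the paper's construction: the choice of union as the closure property, the three-state example, and all function names are assumptions made here for illustration only.

```python
from itertools import combinations

def rabin_accepts(inf_set, pairs):
    # Rabin acceptance on an infinity set: some pair (avoid, visit) has
    # no 'avoid' state and at least one 'visit' state seen infinitely often.
    return any(not (inf_set & avoid) and bool(inf_set & visit)
               for avoid, visit in pairs)

def buchi_accepts(inf_set, accepting):
    # Buchi acceptance: some accepting state is seen infinitely often.
    return bool(inf_set & accepting)

def split_infinity_sets(states, accepts):
    # Enumerate every non-empty subset of states as a candidate infinity set
    # (assumption: we ignore which subsets a concrete automaton can realize).
    subsets = {frozenset(c)
               for r in range(1, len(states) + 1)
               for c in combinations(sorted(states), r)}
    accepting = {s for s in subsets if accepts(s)}
    return accepting, subsets - accepting

def closed_under_union(family):
    # True if the union of any two members is again a member.
    return all((a | b) in family for a, b in combinations(family, 2))

states = {"a", "b", "c"}

# Hypothetical Rabin condition: see a forever but not b, or b forever but not a.
pairs = [(frozenset({"b"}), frozenset({"a"})),
         (frozenset({"a"}), frozenset({"b"}))]
rabin_acc, rabin_rej = split_infinity_sets(
    states, lambda s: rabin_accepts(s, pairs))

# Hypothetical Buchi condition: see a infinitely often.
buchi_acc, buchi_rej = split_infinity_sets(
    states, lambda s: buchi_accepts(s, frozenset({"a"})))

print("Rabin accepting closed:", closed_under_union(rabin_acc))
print("Rabin rejecting closed:", closed_under_union(rabin_rej))
print("Buchi accepting closed:", closed_under_union(buchi_acc))
print("Buchi rejecting closed:", closed_under_union(buchi_rej))
```

In this toy example the Büchi condition keeps both families closed under union, while the Rabin condition does not keep its accepting family closed ({a} and {b} are accepting, their union {a, b} is not). That asymmetry is the kind of structural difference the abstract alludes to, though the paper should be consulted for the exact property it uses.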

Award ID(s): 2009022 2146563

PAR ID: 10417927

Author(s) / Creator(s): Hahn, Ernst Moritz; Perez, Mateo; Schewe, Sven; Somenzi, Fabio; Trivedi, Ashutosh; Wojtczak, Dominik

Editor(s): Bouajjani, A.; Holík, L.; Wu, Z.

Date Published: 2022-10-21

Journal Name: International Symposium on Automated Technology for Verification and Analysis (ATVA 2022)

Volume: 13505

Page Range / eLocation ID: 42-57

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript
Conference Paper:
https://doi.org/10.1007/978-3-031-19992-9_3