Omega-Regular Decision Processes

Hahn, E M; Perez, M; Schewe, S; Somenzi, F; Trivedi, A; Wojtczak, D


Regular decision processes (RDPs) are a subclass of non-Markovian decision processes where the transition and reward functions are guarded by some regular property of the past (a lookback). While RDPs enable intuitive and succinct representation of non-Markovian decision processes, their expressive power coincides with that of finite-state Markov decision processes (MDPs). We introduce omega-regular decision processes (ODPs), where the non-Markovian aspect of the transition and reward functions is extended to an ω-regular lookahead over the system evolution. Semantically, these lookaheads can be considered as promises made by the decision maker or the learning agent about her future behavior. In particular, we assume that if the promised lookaheads are not fulfilled, then the decision maker receives a payoff of ⊥ (the least desirable payoff), overriding any rewards collected by the decision maker. We enable optimization and learning for ODPs under the discounted-reward objective by reducing them to lexicographic optimization and learning over finite MDPs. We present experimental results demonstrating the effectiveness of the proposed reduction.
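The payoff rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' construction: it assumes a finite list of rewards standing in for a run, a boolean flag `promises_kept` that is simply given (deciding an ω-regular promise requires the full infinite run), and `-inf` as a stand-in for ⊥. The names `BOTTOM`, `discounted_return`, `payoff`, and `lexicographic_key` are hypothetical.

```python
import math

# Assumption: represent the least desirable payoff ⊥ as negative infinity.
BOTTOM = -math.inf


def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum(r * gamma**t for t, r in enumerate(rewards))


def payoff(rewards, promises_kept, gamma=0.9):
    """Return ⊥ if any promised lookahead was broken, else the discounted return."""
    if not promises_kept:
        return BOTTOM  # overrides every reward collected along the run
    return discounted_return(rewards, gamma)


def lexicographic_key(rewards, promises_kept, gamma=0.9):
    """Order runs first by promise fulfilment, then by discounted return."""
    return (promises_kept, discounted_return(rewards, gamma))


# A run that keeps its promises beats a higher-reward run that breaks them.
honest = ([1.0, 1.0, 1.0], True)
greedy = ([100.0], False)
best = max([honest, greedy], key=lambda run: lexicographic_key(*run, gamma=0.5))
```

Here `best` is the run that keeps its promises, which reflects the intuition that fulfilling promises takes priority over collected reward. The reduction in the paper operates on finite MDPs rather than on individual runs.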

Award ID(s): 2146563, 2009022

PAR ID: 10526018

Author(s) / Creator(s): Hahn, E M; Perez, M; Schewe, S; Somenzi, F; Trivedi, A; Wojtczak, D

Publisher / Repository: The Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24)

Date Published: 2024-04-20

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
