A Provably-Efficient Model-Free Algorithm for Infinite-Horizon Average-Reward Constrained Markov Decision Processes

Wei, Honghao; Liu, Xin; Ying, Lei

doi:10.1609/aaai.v36i4.20302

This paper presents a model-free reinforcement learning (RL) algorithm for infinite-horizon average-reward Constrained Markov Decision Processes (CMDPs). For a sufficiently large learning horizon K, the proposed algorithm achieves sublinear regret and zero constraint violation. The bounds depend on the number of states S, the number of actions A, and two constants that are independent of K.

Award ID(s): 2001687; 2002608

NSF-PAR ID: 10342263

Author(s) / Creator(s): Wei, Honghao; Liu, Xin; Ying, Lei

Date Published: 2022-06-30

Journal Name: Proceedings of the AAAI Conference on Artificial Intelligence

Volume: 36

Issue: 4

ISSN: 2159-5399

Page Range / eLocation ID: 3868 to 3876

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1609/aaai.v36i4.20302
