Relaxed Equilibria for Time-Inconsistent Markov Decision Processes

Bayraktar, Erhan; Huang, Yu-Jui; Wang, Zhenhua; Zhou, Zhou

doi:10.1287/moor.2023.0209

Citation Details

Relaxed Equilibria for Time-Inconsistent Markov Decision Processes

This paper considers an infinite-horizon Markov decision process (MDP) that allows for general nonexponential discount functions in both discrete and continuous time. Because of the inherent time inconsistency, we look for a randomized equilibrium policy (i.e., relaxed equilibrium) in an intrapersonal game between an agent’s current and future selves. When we modify the MDP by entropy regularization, a relaxed equilibrium is shown to exist by a nontrivial entropy estimate. As the degree of regularization diminishes, the entropy-regularized MDPs approximate the original MDP, which gives the general existence of a relaxed equilibrium in the limit by weak convergence arguments. As opposed to prior studies that consider only deterministic policies, our existence of an equilibrium does not require any convexity (or concavity) of the controlled transition probabilities and reward function. Interestingly, this benefit of considering randomized policies is unique to the time-inconsistent case. more »

Award ID(s):: 2109002

PAR ID:: 10636081

Author(s) / Creator(s):: Bayraktar, Erhan; Huang, Yu-Jui; Wang, Zhenhua; Zhou, Zhou

Publisher / Repository:: Institute for Operations Research and the Management Sciences (INFORMS)

Date Published:: 2024-10-23

Journal Name:: Mathematics of Operations Research

ISSN:: 0364-765X

Format(s):: Medium: X

Sponsoring Org:: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript1.0
Journal Article:
https://doi.org/10.1287/moor.2023.0209

More Like this