Reinforcement Learning with Depreciating Assets

Dohmen, T.; Trivedi, A.


A basic assumption of traditional reinforcement learning is that the value of a reward does not change once it is received by an agent. The present work forgoes this assumption and considers the situation where the value of a reward decays proportionally to the time elapsed since it was obtained. Emphasizing the inflection point occurring at the time of payment, we use the term asset to refer to a reward that is currently in the possession of an agent. Adopting this language, we initiate the study of depreciating assets within the framework of infinite-horizon quantitative optimization. In particular, we propose a notion of asset depreciation, inspired by classical exponential discounting, where the value of an asset is scaled by a fixed discount factor at each time step after it is obtained by the agent. We formulate an equational characterization of optimality in this context, establish that optimal values and policies can be computed efficiently, and develop a model-free reinforcement learning approach to obtain optimal policies.
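The depreciation rule described in the abstract can be illustrated with a minimal sketch. It assumes only what the abstract states: an asset obtained at step k is scaled by a fixed factor at every later step, so it is worth r_k * gamma**(t - k) at step t. The function name `depreciated_holdings` and the parameter name `gamma` are hypothetical and chosen for illustration. The sketch tracks the total value of held assets over time and does not reproduce the paper's optimization objective or its learning algorithm.

```python
from typing import List, Sequence


def depreciated_holdings(rewards: Sequence[float], gamma: float) -> List[float]:
    """Total value of all held assets after each time step.

    Assumption (illustrative, not the paper's exact formulation): a reward r_k
    received at step k is worth r_k * gamma**(t - k) at any later step t >= k.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    holdings = []
    total = 0.0
    for r in rewards:
        # Every asset already held loses value by the factor gamma,
        # then the newly received reward is added at full value.
        total = gamma * total + r
        holdings.append(total)
    return holdings


print(depreciated_holdings([1.0, 0.0, 2.0], gamma=0.5))  # [1.0, 0.5, 2.25]
```

With `gamma = 1.0` the assets never depreciate and the holdings reduce to a running sum of rewards, which corresponds to the traditional assumption that a reward keeps its value once received.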

Award ID(s): 2146563

PAR ID: 10419683

Author(s) / Creator(s): Dohmen, T.; Trivedi, A.

Date Published: 2023-05-30

Journal Name: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (AAMAS '23)

Page Range / eLocation ID: 2628–2630

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper: The DOI is not currently available.
