Approximate Relative Value Learning for Average-reward Continuous State MDPs

Sharma, Hiteshi; Jafarnia-Jahromi, Mehdi; Jain, Rahul


In this paper, we propose an approximate relative value learning (ARVL) algorithm for non-parametric MDPs with continuous state space, finite actions, and the average-reward criterion. It is a sampling-based algorithm combined with kernel density estimation and function approximation via nearest neighbors. The theoretical analysis uses a random contraction operator framework and a stochastic dominance argument. To our knowledge, this is the first algorithm for continuous state space MDPs under the average-reward criterion that has these provable properties and does not require any discretization of the state space. We then evaluate the proposed algorithm numerically on a benchmark problem.
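As a rough illustration of the ingredients named in the abstract, the following is a minimal sketch of sampled relative value iteration with 1-nearest-neighbor function approximation on a one-dimensional state space. It is not the paper's ARVL algorithm: the kernel density estimation step is omitted, and the toy MDP, the function names (`nearest_neighbor_value`, `approx_relative_value_iteration`, `toy_step`, `toy_reward`), and all parameter values are assumptions made only for this example.

```python
import numpy as np

def nearest_neighbor_value(query, anchors, values):
    """Return, for each query state, the value stored at its closest anchor state."""
    idx = np.abs(query[:, None] - anchors[None, :]).argmin(axis=1)
    return values[idx]

def approx_relative_value_iteration(step, reward, actions, n_states=50,
                                    n_next=20, n_iters=30, seed=0):
    """Sketch of relative value iteration on sampled states (hypothetical helper).

    step(states, action, rng) -> array of sampled next states in [0, 1]
    reward(state, action)     -> scalar reward
    """
    rng = np.random.default_rng(seed)
    # Sample anchor states from [0, 1] instead of using a fixed discretization grid.
    anchors = np.sort(rng.uniform(0.0, 1.0, size=n_states))
    h = np.zeros(n_states)  # relative value estimates at the anchors
    gain = 0.0
    for _ in range(n_iters):
        q = np.empty((n_states, len(actions)))
        for j, a in enumerate(actions):
            for i, s in enumerate(anchors):
                # Empirical expectation over sampled next states,
                # with h extended off the anchors by nearest neighbor.
                nxt = step(np.full(n_next, s), a, rng)
                q[i, j] = reward(s, a) + nearest_neighbor_value(nxt, anchors, h).mean()
        th = q.max(axis=1)
        # Subtract the value at a reference state (the first anchor) so the
        # iterates stay bounded; since h[0] is 0, th[0] estimates the average reward.
        gain = th[0]
        h = th - th[0]
    return anchors, h, gain

# Toy problem (assumed for illustration): drift left or right with Gaussian noise.
def toy_step(states, action, rng):
    return np.clip(states + action + 0.05 * rng.standard_normal(states.shape), 0.0, 1.0)

def toy_reward(state, action):
    return float(state)

anchors, h, gain = approx_relative_value_iteration(toy_step, toy_reward, actions=[-0.1, 0.1])
```

Subtracting the value at a fixed reference state is the standard device that distinguishes relative value iteration from discounted value iteration; the choice of the first anchor as the reference here is arbitrary.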

Award ID(s): 1810447; 1817212

PAR ID: 10128113

Author(s) / Creator(s): Sharma, Hiteshi; Jafarnia-Jahromi, Mehdi; Jain, Rahul

Date Published: 2019-07-01

Journal Name: Proceedings UAI

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper: The DOI is not currently available.
