

This content will become publicly available on July 1, 2025

Title: Model-based Reinforcement Learning for Parameterized Action Spaces
We propose a novel model-based reinforcement learning algorithm, Dynamics Learning and predictive control with Parameterized Actions (DLPA), for Parameterized Action Markov Decision Processes (PAMDPs). The agent learns a parameterized-action-conditioned dynamics model and plans with a modified Model Predictive Path Integral (MPPI) control. Through the lens of Lipschitz continuity, we theoretically quantify the gap between the value achieved by the trajectory generated during planning and that of the optimal trajectory. Our empirical results on several standard benchmarks show that our algorithm achieves better sample efficiency and asymptotic performance than state-of-the-art PAMDP methods.
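To make the two components in the abstract concrete, here is a minimal sketch (in PyTorch, assuming nothing about the paper's actual code): a dynamics model conditioned on a parameterized action, i.e., a discrete action type plus its continuous parameter vector, and an MPPI-style sampling planner. All names (`ParamActionDynamics`, `mppi_plan`, `reward_fn`) are hypothetical. Standard MPPI returns an importance-weighted average of the sampled controls; because averaging discrete action types is not well defined, this sketch simply executes the first action of the highest-weight sample.

```python
import torch
import torch.nn as nn

class ParamActionDynamics(nn.Module):
    """Hypothetical dynamics model conditioned on a parameterized action:
    a discrete action type k plus a continuous parameter vector x_k."""
    def __init__(self, state_dim, n_types, param_dim, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_types, hidden)
        self.net = nn.Sequential(
            nn.Linear(state_dim + hidden + param_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim))

    def forward(self, s, k, x):
        # Predict the next state as a residual update of the current state.
        return s + self.net(torch.cat([s, self.embed(k), x], dim=-1))

@torch.no_grad()
def mppi_plan(model, reward_fn, s0, n_types, param_dim,
              horizon=10, n_samples=256, temperature=1.0):
    """Sample parameterized-action sequences, roll them out through the
    learned model, and weight each trajectory by its exponentiated return.
    s0: (1, state_dim); reward_fn maps a state batch to per-sample rewards."""
    k = torch.randint(n_types, (n_samples, horizon))   # discrete types
    x = torch.randn(n_samples, horizon, param_dim)     # continuous params
    s = s0.expand(n_samples, -1)
    returns = torch.zeros(n_samples)
    for t in range(horizon):
        s = model(s, k[:, t], x[:, t])
        returns = returns + reward_fn(s)
    w = torch.softmax(returns / temperature, dim=0)    # trajectory weights
    best = torch.argmax(w)
    return k[best, 0], x[best, 0]   # first action of the best sample
```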
Award ID(s):
1844960 1955361
PAR ID:
10567027
Author(s) / Creator(s):
Publisher / Repository:
Proceedings of the 41st International Conference on Machine Learning
Date Published:
Format(s):
Medium: X
Location:
Vienna, Austria
Sponsoring Org:
National Science Foundation
More Like this
  1. We propose a novel parameterized skill-learning algorithm that learns transferable parameterized skills and synthesizes them into a new action space supporting efficient learning in long-horizon tasks. We leverage off-policy meta-RL combined with a trajectory-centric smoothness term to learn a set of parameterized skills. Our agent can use these learned skills to construct a three-level hierarchical framework that models a Temporally-extended Parameterized Action Markov Decision Process. We empirically demonstrate that the proposed algorithms enable an agent to solve a set of difficult long-horizon (obstacle-course and robot-manipulation) tasks.
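As a toy illustration of the interface this item implies, the sketch below treats a parameterized skill as a discrete skill index with a low-level policy and a termination condition, executed as one temporally-extended action. Everything here (`ParameterizedSkill`, `execute_skill`, the `env_step` signature) is hypothetical, not the paper's API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ParameterizedSkill:
    skill_id: int
    policy: Callable       # pi(state, theta) -> low-level action
    termination: Callable  # beta(state) -> bool

def execute_skill(env_step, state, skill, theta, max_steps=50):
    """Run one temporally-extended parameterized action and return the
    resulting state, accumulated reward, and done flag, i.e., the transition
    a high-level agent would observe in a temporally-extended PAMDP."""
    total_reward, done = 0.0, False
    for _ in range(max_steps):
        action = skill.policy(state, theta)
        state, reward, done = env_step(action)
        total_reward += reward
        if done or skill.termination(state):
            break
    return state, total_reward, done
```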
  2. We propose a hierarchical learning architecture for predictive control in unknown environments. We consider a constrained nonlinear dynamical system and assume the availability of state-input trajectories solving control tasks in different environments. A parameterized environment model generates state constraints specific to each task, which are satisfied by the stored trajectories. Our goal is to find a feasible trajectory for a new task in an unknown environment. From stored data, we learn strategies in the form of target sets in a reduced-order state space. These strategies are applied to the new task in real time using a local forecast of the new environment, and the resulting output is used as a terminal region by a low-level receding-horizon controller. We show how to i) design the target sets from past data and ii) incorporate them into a model predictive control scheme with a shifting horizon that ensures safety of the closed-loop system when performing the new task. We prove the feasibility of the resulting control policy and verify the proposed method in a robotic path-planning application.
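The following is a toy sketch of the low-level controller described above, assuming a linear system and a learned target set represented as a polytope {z : Hz <= h}; the cost weights and matrices are placeholders, not the paper's models. It shows how the strategy set enters the MPC problem as a terminal constraint.

```python
import numpy as np
import cvxpy as cp

def mpc_with_terminal_set(A, B, x0, H_term, h_term, N=10):
    """One shifting-horizon step: regulate the state toward the origin while
    ending the planned trajectory inside the learned target set."""
    n, m = B.shape
    x = cp.Variable((n, N + 1))
    u = cp.Variable((m, N))
    cost, constraints = 0, [x[:, 0] == x0]
    for t in range(N):
        cost += cp.sum_squares(x[:, t]) + 0.1 * cp.sum_squares(u[:, t])
        constraints += [x[:, t + 1] == A @ x[:, t] + B @ u[:, t],
                        cp.norm(u[:, t], "inf") <= 1.0]  # input constraint
    # Terminal constraint: finish inside the target set learned from data.
    constraints += [H_term @ x[:, N] <= h_term]
    cp.Problem(cp.Minimize(cost), constraints).solve()
    return u.value[:, 0]  # apply the first input, then shift the horizon
```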
  3. This paper presents a multirotor control architecture where Model Predictive Path Integral control (MPPI) and ℒ1 adaptive control are combined to achieve both fast model-predictive trajectory planning and robust trajectory tracking. MPPI provides a framework for solving nonlinear MPC with complex cost functions in real time. However, it often lacks robustness, especially when the simulated dynamics differ from the true dynamics. We show that the ℒ1 adaptive controller robustifies the architecture, allowing the overall system to behave similarly to the nominal system simulated with MPPI. The architecture is validated in a simulated multirotor racing environment.
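For reference, this is a compact sketch of the standard information-theoretic MPPI update the item builds on; the `dynamics` and `cost` callables stand in for the simulated multirotor model and racing cost, and the ℒ1 augmentation is not shown.

```python
import numpy as np

def mppi_update(u_nom, dynamics, cost, x0, n_samples=512, sigma=0.3, lam=1.0):
    """Perturb a nominal control sequence, roll out the simulated dynamics,
    and return the exponentially weighted average of the perturbed controls."""
    horizon, m = u_nom.shape
    eps = sigma * np.random.randn(n_samples, horizon, m)
    total_cost = np.zeros(n_samples)
    for i in range(n_samples):
        x = x0
        for t in range(horizon):
            x = dynamics(x, u_nom[t] + eps[i, t])
            total_cost[i] += cost(x)
    w = np.exp(-(total_cost - total_cost.min()) / lam)  # soft-min weights
    w /= w.sum()
    return u_nom + np.einsum("i,itm->tm", w, eps)
```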
  4. We develop an approach to improve the learning capabilities of robotic systems by combining learned predictive models with experience-based state-action policy mappings. Predictive models provide an understanding of the task and the dynamics, while experience-based (model-free) policy mappings encode favorable actions that override planned actions. We refer to our approach of systematically combining model-based and model-free learning methods as hybrid learning. Our approach efficiently learns motor skills and improves the performance of predictive models and experience-based policies. Moreover, our approach enables policies (both model-based and model-free) to be updated using any off-policy reinforcement learning method. We derive a deterministic method of hybrid learning by optimally switching between learning modalities. We adapt our method to a stochastic variation that relaxes some of the key assumptions in the original derivation. Our deterministic and stochastic variations are tested on a variety of robot control benchmark tasks in simulation as well as a hardware manipulation task. We extend our approach for use with imitation learning methods, where experience is provided through demonstrations, and we test the expanded capability with a real-world pick-and-place task. The results show that our method is capable of improving the performance and sample efficiency of learning motor skills in a variety of experimental domains. 
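A minimal stand-in for the switching idea above; the paper derives an optimal switching rule, so the advantage-based test here is only a hypothetical illustration.

```python
def hybrid_action(state, plan_action, policy, advantage_fn):
    """Pick between a model-based planned action and an experience-based
    policy action, letting the policy override the plan when its estimated
    advantage is higher."""
    a_plan = plan_action(state)    # model-based proposal
    a_policy = policy(state)      # experience-based (model-free) proposal
    if advantage_fn(state, a_policy) > advantage_fn(state, a_plan):
        return a_policy           # override the planned action
    return a_plan
```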
  5. We propose a learning-based robust predictive control algorithm that compensates for significant uncertainty in the dynamics for a class of discrete-time systems that are nominally linear with an additive nonlinear component. Such systems commonly model the nonlinear effects of an unknown environment on a nominal system. We optimize over a class of nonlinear feedback policies inspired by the certainty-equivalent "estimate-and-cancel" control laws pioneered in classical adaptive control, achieving significant performance improvements in the presence of large-magnitude uncertainties, a setting in which existing learning-based predictive control algorithms often struggle to guarantee safety. In contrast to previous work in robust adaptive MPC, our approach allows us to exploit structure (i.e., the numerical predictions) in the a priori unknown dynamics learned online through function approximation. Our approach also extends typical nonlinear adaptive control methods to systems with state and input constraints, even when we cannot directly cancel the additive uncertain function from the dynamics. Moreover, we apply contemporary statistical estimation techniques to certify the system's safety through persistent constraint satisfaction with high probability. Finally, we show in simulation that our method can accommodate more significant unknown dynamics terms than existing methods.
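As a toy sketch of the certainty-equivalent "estimate-and-cancel" structure this item refers to: for a nominally linear system x_{t+1} = A x_t + B u_t + d(x_t) with unknown additive term d, apply linear feedback plus cancellation of an online estimate d_hat. The matrices and the estimator are placeholders; the paper additionally handles state and input constraints and the case where d cannot be cancelled directly.

```python
import numpy as np

def estimate_and_cancel(A, B, K, d_hat, x):
    """Certainty-equivalent control: nominal feedback K @ x minus the
    estimated additive nonlinearity, mapped through the pseudoinverse of B."""
    return K @ x - np.linalg.pinv(B) @ d_hat(x)

def closed_loop_step(A, B, K, d_true, d_hat, x):
    # One step of the true system under the estimate-and-cancel law.
    u = estimate_and_cancel(A, B, K, d_hat, x)
    return A @ x + B @ u + d_true(x)
```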