I. Farkaš et al. (Ed.)
We propose a new deep deterministic actor-critic algorithm with an integrated network architecture and an integrated objective function. We stabilize the learning procedure via a novel adaptive objective that roughly keeps the actor unchanged while the critic makes large errors. We reduce the number of network parameters and propose an improved exploration strategy over bounded action spaces. Moreover, we incorporate some recent advances in deep learning into our algorithm. Experiments illustrate that our algorithm speeds up the learning process and considerably reduces sample complexity relative to state-of-the-art algorithms, including TD3, SAC, PPO, and A2C, in continuous control tasks.

