NSF PAR Search | NSF Public Access Repository


Improving Actor-Critic Reinforcement Learning via Hamiltonian Monte Carlo Method

https://doi.org/10.1109/TAI.2022.3215614

Xu, Duo; Fekri, Faramarz (October 2022, IEEE Transactions on Artificial Intelligence)

Actor-critic RL is widely used in robotic control tasks. Viewed from the perspective of variational inference (VI), the policy network is trained to approximate the posterior of actions given the optimality criteria. In practice, however, actor-critic RL may yield suboptimal policy estimates because of the amortization gap and insufficient exploration. In this work, inspired by previous uses of Hamiltonian Monte Carlo (HMC) in VI, we propose to integrate the policy network of actor-critic RL with HMC, an approach we term Hamiltonian Policy. Actions sampled from the base policy are evolved according to HMC, which brings several benefits. First, HMC can improve the policy distribution so that it better approximates the posterior, reducing the amortization gap. Second, HMC can guide exploration toward regions of the action space with higher Q values, improving exploration efficiency. Further, instead of directly applying HMC to RL, we propose a new leapfrog operator to simulate the Hamiltonian dynamics. Finally, in safe RL problems, we find that the proposed method not only improves the achieved return but also reduces safety constraint violations by discarding potentially unsafe actions. In comprehensive experiments on continuous control baselines, including MuJoCo and PyBullet Roboschool, we show that the proposed approach is a data-efficient and easy-to-implement improvement over previous actor-critic methods.
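As a rough illustration of the idea in the abstract above, the sketch below evolves an action with the standard (textbook) leapfrog integrator, using the negative Q value as the potential energy so that the dynamics pull the action toward higher Q. This is not the new leapfrog operator proposed in the paper, which the abstract does not specify. The quadratic `q_value`, its `target` parameter, the step size, and the number of steps are all hypothetical choices made only for this example.

```python
import numpy as np


def q_value(action, target):
    """Hypothetical toy critic: Q is highest when action equals target."""
    return -0.5 * np.sum((action - target) ** 2)


def grad_q(action, target):
    """Gradient of the toy Q function with respect to the action."""
    return -(action - target)


def hamiltonian(action, momentum, target):
    """Total energy: potential U(a) = -Q(a) plus kinetic 0.5 * |p|^2."""
    return -q_value(action, target) + 0.5 * np.sum(momentum ** 2)


def leapfrog(action, momentum, target, step_size, n_steps):
    """Standard leapfrog integration (half kick, drifts and kicks, half kick)."""
    a = np.array(action, dtype=float)
    p = np.array(momentum, dtype=float)
    # Since U(a) = -Q(a), the force -dU/da equals the gradient of Q.
    p = p + 0.5 * step_size * grad_q(a, target)
    for i in range(n_steps):
        a = a + step_size * p
        if i < n_steps - 1:
            p = p + step_size * grad_q(a, target)
    p = p + 0.5 * step_size * grad_q(a, target)
    return a, p


# Example: start from a "base policy" action with zero initial momentum.
target = np.array([0.0, 0.0])
a0 = np.array([1.0, -1.0])
p0 = np.zeros(2)
a1, p1 = leapfrog(a0, p0, target, step_size=0.1, n_steps=10)
print("Q before:", q_value(a0, target), "Q after:", q_value(a1, target))
```

In a full HMC sampler the momentum would be drawn from a Gaussian and the proposal accepted or rejected based on the change in the Hamiltonian; those steps are omitted here to keep the sketch short.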
Full Text Available
Integrating Symbolic Planning and Reinforcement Learning for Following Temporal Logic Specifications

https://doi.org/10.1109/IJCNN55064.2022.9892304

Xu, Duo; Fekri, Faramarz (July 2022, 2022 International Joint Conference on Neural Networks (IJCNN))

Teaching a deep reinforcement learning (RL) agent to follow instructions in multi-task environments is a challenging problem. We consider the setting in which the user defines each task by a linear temporal logic (LTL) formula. However, some causal dependencies in complex environments may be unknown to the user in advance. Hence, when a human user specifies instructions, the robot cannot solve the tasks by simply following them. In this work, we propose a hierarchical reinforcement learning (HRL) framework in which a symbolic transition model is learned to efficiently produce high-level plans that guide the agent in solving different tasks. Specifically, the symbolic transition model is learned by inductive logic programming (ILP) to capture logic rules of state transitions. By planning over the product of the symbolic transition model and the automaton derived from the LTL formula, the agent can resolve causal dependencies and break a causally complex problem down into a sequence of simpler low-level sub-tasks. We evaluate the proposed framework on three environments in both discrete and continuous domains, showing advantages over previous representative methods.
Full Text Available
Accelerating Reinforcement Learning using EEG-based implicit human feedback

https://doi.org/10.1016/j.neucom.2021.06.064

Xu, Duo; Agarwal, Mohit; Gupta, Ekansh; Fekri, Faramarz; Sivakumar, Raghupathy (October 2021, Neurocomputing)

Full Text Available
Blink to Get In: Biometric Authentication for Mobile Devices using EEG Signals

https://doi.org/10.1109/ICC40277.2020.9148741

Gupta, Ekansh; Agarwal, Mohit; Sivakumar, Raghupathy (June 2020, ICC 2020 - 2020 IEEE International Conference on Communications (ICC))

Full Text Available
Charge for a whole day: Extending Battery Life for BCI Wearables using a Lightweight Wake-Up Command

https://doi.org/10.1145/3313831.3376738

Agarwal, Mohit; Sivakumar, Raghupathy (April 2020, Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems)

Full Text Available
Blink: A Fully Automated Unsupervised Algorithm for Eye-Blink Detection in EEG Signals

https://doi.org/10.1109/ALLERTON.2019.8919795

Agarwal, Mohit; Sivakumar, Raghupathy (September 2019, 2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton))

Full Text Available
Cerebro: A Wearable Solution to Detect and Track User Preferences using Brainwaves

https://doi.org/10.1145/3325424.3329660

Agarwal, Mohit; Sivakumar, Raghupathy (June 2019, WearSys '19: The 5th ACM Workshop on Wearable Systems and Applications)

Full Text Available
