Unified Learning from Demonstrations, Corrections, and Preferences during Physical Human–Robot Interaction

Mehta, Shaunak A; Losey, Dylan P

doi:10.1145/3623384

Humans can leverage physical interaction to teach robot arms. This physical interaction takes multiple forms depending on the task, the user, and what the robot has learned so far. State-of-the-art approaches focus on learning from a single modality, or combine some interaction types. Some methods do so by assuming that the robot has prior information about the features of the task and the reward structure. By contrast, in this article, we introduce an algorithmic formalism that unites learning from demonstrations, corrections, and preferences. Our approach makes no assumptions about the tasks the human wants to teach the robot; instead, we learn a reward model from scratch by comparing the human’s input to nearby alternatives, i.e., trajectories close to the human’s feedback. We first derive a loss function that trains an ensemble of reward models to match the human’s demonstrations, corrections, and preferences. The type and order of feedback is up to the human teacher: We enable the robot to collect this feedback passively or actively. We then apply constrained optimization to convert our learned reward into a desired robot trajectory. Through simulations and a user study, we demonstrate that our proposed approach more accurately learns manipulation tasks from physical human interaction than existing baselines, particularly when the robot is faced with new or unexpected objectives. Videos of our user study are available at https://youtu.be/FSUJsTYvEKU
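
The core idea in the abstract, scoring the human's input above nearby alternatives and training several reward models on that comparison, can be illustrated with a small sketch. This is not the authors' implementation: the linear reward, the featurization (`traj_features`), the Gaussian perturbation used to generate alternatives, the softmax-style loss, and every function name and hyperparameter below are assumptions chosen only to keep the example short and runnable. The article itself learns the reward from scratch and also includes a constrained-optimization step that is not covered here.

```python
import math
import random

def traj_features(traj):
    # Hypothetical featurization: per-dimension sum of waypoint coordinates.
    return [sum(p[d] for p in traj) for d in range(len(traj[0]))]

def reward(w, traj):
    # Assumed linear reward over the hypothetical features.
    return sum(wi * fi for wi, fi in zip(w, traj_features(traj)))

def nearby_alternatives(traj, n, scale, rng):
    # Alternatives "close to the human's feedback": Gaussian-perturbed copies.
    return [[tuple(x + rng.gauss(0.0, scale) for x in p) for p in traj]
            for _ in range(n)]

def softmax_loss_and_grad(w, human, alts):
    # Negative log-probability that the human's trajectory is chosen
    # over the alternatives under a softmax of rewards.
    cands = [human] + alts
    feats = [traj_features(t) for t in cands]
    scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in feats]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    loss = -(scores[0] - m - math.log(z))
    expected = [sum(p * f[d] for p, f in zip(probs, feats))
                for d in range(len(w))]
    grad = [expected[d] - feats[0][d] for d in range(len(w))]
    return loss, grad

def train_ensemble(human, n_models=3, steps=200, lr=0.05, seed=0):
    # Each member gets its own random initialization and its own alternatives.
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_models):
        w = [rng.uniform(-0.1, 0.1) for _ in human[0]]
        alts = nearby_alternatives(human, n=10, scale=0.3, rng=rng)
        for _ in range(steps):
            _, grad = softmax_loss_and_grad(w, human, alts)
            w = [wi - lr * gi for wi, gi in zip(w, grad)]
        ensemble.append(w)
    return ensemble

def ensemble_reward(ensemble, traj):
    # Mean reward across members, plus their spread (standard deviation).
    rs = [reward(w, traj) for w in ensemble]
    mean = sum(rs) / len(rs)
    spread = math.sqrt(sum((r - mean) ** 2 for r in rs) / len(rs))
    return mean, spread

# Usage: a 2-D demonstration with five waypoints (made-up data).
human_demo = [(0.1 * i, 0.5) for i in range(5)]
models = train_ensemble(human_demo)
mean_r, spread_r = ensemble_reward(models, human_demo)
```

In a sketch like this, the spread across ensemble members is one plausible signal for deciding when to ask the human for more feedback; whether the article uses the ensemble that way is not stated in the abstract.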

Award ID(s): 2129201

PAR ID: 10567707

Author(s) / Creator(s): Mehta, Shaunak A; Losey, Dylan P

Publisher / Repository: ACM Transactions on Human-Robot Interaction

Date Published: 2024-09-30

Journal Name: ACM Transactions on Human-Robot Interaction

Volume: 13

Issue: 3

ISSN: 2573-9522

Page Range / eLocation ID: 1 to 25

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.1145/3623384
