Title: On Trust-aware Assistance-seeking in Human-Supervised Autonomy
Using the context of human-supervised object collection tasks, we explore policies for a robot to seek assistance from a human supervisor and avoid loss of human trust in the robot. We consider a human-robot interaction scenario in which a mobile manipulator chooses to collect objects either autonomously or through human assistance, while the human supervisor monitors the robot’s operation, assists when asked, or intervenes if the human perceives that the robot may not accomplish its goal. We design an optimal assistance-seeking policy for the robot using a Partially Observable Markov Decision Process (POMDP) setting in which human trust is a hidden state and the objective is to maximize collaborative performance. We conduct two sets of human-robot interaction experiments. Data from the first set are used to estimate the POMDP parameters, from which we compute an optimal assistance-seeking policy that is then used in the second set. For most participants, the estimated POMDP reveals that humans are more likely to intervene when their trust is low and the robot is performing a high-complexity task, and that the robot asking for assistance in high-complexity tasks can increase human trust in the robot. Our experimental results show that the proposed trust-aware policy yields superior performance compared with an optimal trust-agnostic policy.
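To illustrate what treating trust as a hidden state involves, the following is a minimal sketch of a discrete Bayes filter that tracks a belief over two trust levels from the supervisor's observed responses. It is not the authors' model: the two-level trust state, the predict-then-correct ordering, the names `TRANSITION`, `P_INTERVENE`, and `update_belief`, and every probability are hypothetical placeholders chosen only to mirror the qualitative findings stated in the abstract.

```python
# All probabilities below are made-up placeholders, not values from the paper.
LOW, HIGH = 0, 1

# Hypothetical P(next trust | current trust, robot action); rows index current trust.
TRANSITION = {
    "autonomous": [[0.8, 0.2], [0.3, 0.7]],
    "ask": [[0.6, 0.4], [0.1, 0.9]],
}

# Hypothetical P(human intervenes | trust, task complexity), indexed by trust.
P_INTERVENE = {
    "low": [0.2, 0.05],
    "high": [0.7, 0.2],
}


def update_belief(belief, robot_action, complexity, intervened):
    """One step of a discrete Bayes filter over the hidden trust state."""
    trans = TRANSITION[robot_action]
    # Prediction: push the belief through the assumed trust transition model.
    predicted = [
        sum(belief[s] * trans[s][s_next] for s in (LOW, HIGH))
        for s_next in (LOW, HIGH)
    ]
    # Correction: weight by the likelihood of the observed human response.
    likelihood = [p if intervened else 1.0 - p for p in P_INTERVENE[complexity]]
    unnormalized = [predicted[s] * likelihood[s] for s in (LOW, HIGH)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Start from a uniform belief; the human intervenes during a high-complexity task.
belief = update_belief([0.5, 0.5], "autonomous", "high", intervened=True)
```

With these placeholder numbers, an intervention during a high-complexity autonomous attempt shifts the belief toward low trust, while the absence of an intervention shifts it toward high trust. A trust-aware policy would map such a belief to the choice between acting autonomously and asking for assistance.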
Award ID(s):
2024649
PAR ID:
10445563
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
American Control Conference
Page Range / eLocation ID:
3901 to 3906
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Using a dual-task paradigm, we explore how robot actions, performance, and the introduction of a secondary task influence human trust and engagement. In our study, a human supervisor simultaneously engages in a target-tracking task while supervising a mobile manipulator performing an object collection task. The robot can either autonomously collect the object or ask for human assistance. The human supervisor also has the choice to rely on or interrupt the robot. Using data from initial experiments, we model the dynamics of human trust and engagement using a linear dynamical system (LDS). Furthermore, we develop a human action model to define the probability of human reliance on the robot. Our model suggests that participants are more likely to interrupt the robot when their trust and engagement are low during high-complexity collection tasks. Using Model Predictive Control (MPC), we design an optimal assistance-seeking policy. Evaluation experiments demonstrate the superior performance of the MPC policy over the baseline policy for most participants. 
  2. One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy. A known problem with this “off-policy” approach is that the robot’s errors compound when drifting away from the supervisor’s demonstrations. On-policy techniques alleviate this by iteratively collecting corrective actions for the current robot policy. However, these techniques can be tedious for human supervisors, add a significant computational burden, and may visit dangerous states during training. We propose an off-policy approach that injects noise into the supervisor’s policy while demonstrating. This forces the supervisor to demonstrate how to recover from errors. We propose a new algorithm, DART (Disturbances for Augmenting Robot Trajectories), that collects demonstrations with injected noise, and optimizes the noise level to approximate the error of the robot’s trained policy during data collection. We compare DART with DAgger and Behavior Cloning in two domains: in simulation with an algorithmic supervisor on the MuJoCo tasks (Walker, Humanoid, Hopper, Half-Cheetah) and in physical experiments with human supervisors training a Toyota HSR robot to perform grasping in clutter. For high-dimensional tasks like Humanoid, DART can be up to 3x faster in computation time and only decreases the supervisor’s cumulative reward by 5% during training, whereas DAgger executes policies that have 80% less cumulative reward than the supervisor. On the grasping in clutter task, DART obtains on average a 62% performance increase over Behavior Cloning.
  3. Recent work has considered personalized route planning based on user profiles, but none of it accounts for human trust. We argue that human trust is an important factor to consider when planning routes for automated vehicles. This article presents a trust-based route-planning approach for automated vehicles. We formalize the human-vehicle interaction as a partially observable Markov decision process (POMDP) and model trust as a partially observable state variable of the POMDP, representing the human’s hidden mental state. We build data-driven models of human trust dynamics and takeover decisions, which are incorporated in the POMDP framework, using data collected from an online user study with 100 participants on the Amazon Mechanical Turk platform. We compute optimal routes for automated vehicles by solving optimal policies in the POMDP planning and evaluate the resulting routes via human subject experiments with 22 participants on a driving simulator. The experimental results show that participants taking the trust-based route generally reported more positive responses in the after-driving survey than those taking the baseline (trust-free) route. In addition, we analyze the trade-offs between multiple planning objectives (e.g., trust, distance, energy consumption) via multi-objective optimization of the POMDP. We also identify a set of open issues and implications for real-world deployment of the proposed approach in automated vehicles. 
  4. This paper presents a framework to learn the reward function underlying high-level sequential tasks from demonstrations. The purpose of reward learning, in the context of learning from demonstration (LfD), is to generate policies that mimic the demonstrator’s policies, thereby enabling imitation learning. We focus on a human-robot interaction (HRI) domain where the goal is to learn and model structured interactions between a human and a robot. Such interactions can be modeled as a partially observable Markov decision process (POMDP) where the partial observability is caused by uncertainties associated with the ways humans respond to different stimuli. The key challenge in finding a good policy in such a POMDP is determining the reward function that was observed by the demonstrator. Existing inverse reinforcement learning (IRL) methods for POMDPs are computationally very expensive and the problem is not well understood. In comparison, IRL algorithms for Markov decision processes (MDPs) are well defined and computationally efficient. We propose an approach to reward function learning for high-level sequential tasks from human demonstrations where the core idea is to reduce the underlying POMDP to an MDP and apply any efficient MDP-IRL algorithm. Our extensive experiments suggest that the reward function learned this way generates POMDP policies that mimic the policies of the demonstrator well.
  5. Robots are increasingly being employed for diverse applications where they must work and coexist with humans. Trust in human–robot collaboration (HRC) is a critical aspect of any shared-task performance for both the human and the robot. A human-trusting robot has been investigated by numerous researchers. However, a robot-trusting human, which is also a significant issue in HRC, is seldom explored in the field of robotics. Motivated by this gap, we propose a novel trust-assist framework for human–robot co-carry tasks in this study. This framework allows the robot to determine a trust level for its human co-carry partner. The calculation of this trust level is based on human motions, past interactions between the human–robot pair, and the human’s current performance in the co-carry task. The trust level between the human and the robot is evaluated dynamically throughout the collaborative task, which allows the trust to change if the human performs false positive actions and can help the robot avoid making unpredictable movements and causing injury to the human. Additionally, the proposed framework can enable the robot to generate and perform assisting movements to follow human-carrying motions and paces when the human is considered trustworthy in the co-carry task. The results of our experiments suggest that the robot effectively assists the human in real-world collaborative tasks through the proposed trust-assist framework.