Shared autonomy provides an effective framework for human-robot collaboration that takes advantage of the complementary strengths of humans and robots to achieve common goals. Many existing approaches to shared autonomy make restrictive assumptions that the goal space, environment dynamics, or human policy are known a priori, or are limited to discrete action spaces, preventing those methods from scaling to complex real-world environments. We propose a model-free, residual policy learning algorithm for shared autonomy that alleviates the need for these assumptions. Our agents are trained to minimally adjust the human's actions such that a set of goal-agnostic constraints is satisfied. We test our method in two continuous control environments: Lunar Lander, a 2D flight control domain, and a 6-DOF quadrotor reaching task. In experiments with human and surrogate pilots, our method significantly improves task performance without any knowledge of the human's goal beyond the constraints. These results highlight the ability of model-free deep reinforcement learning to realize assistive agents suited to continuous control settings with little knowledge of user intent.
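As a concrete illustration of the residual formulation described above, here is a minimal execution-time sketch, assuming a trained residual policy and a box-bounded continuous action space; the function and variable names are illustrative, not from the paper.

```python
# Minimal sketch of residual shared autonomy at execution time.
# Assumed interface (ours, not the paper's): `residual_policy(observation,
# human_action)` returns a learned correction; `action_low`/`action_high`
# bound the continuous action space.
import numpy as np

def assisted_action(observation, human_action, residual_policy,
                    action_low, action_high):
    """Minimally adjust the human's action with a learned residual."""
    # The residual policy conditions on both the state and the pilot's
    # command, so it can correct only when a constraint is at risk.
    correction = residual_policy(observation, human_action)
    # Execute the pilot's command plus the correction, clipped to bounds.
    return np.clip(human_action + correction, action_low, action_high)
```

During training, penalizing the magnitude of `correction` is one natural way to encourage the "minimally adjust" behavior the abstract describes.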
Aligning Learning with Communication in Shared Autonomy
                        
                    
    
            Assistive robot arms can help humans by partially automating their desired tasks. Consider an adult with motor impairments controlling an assistive robot arm to eat dinner. The robot can reduce the number of human inputs — and how precise those inputs need to be — by recognizing what the human wants (e.g., a fork) and assisting for that task (e.g., moving towards the fork). Prior research has largely focused on learning the human’s task and providing meaningful assistance. But as the robot learns and assists, we also need to ensure that the human understands the robot’s intent (e.g., does the human know the robot is reaching for a fork?). In this paper, we study the effects of communicating learned assistance from the robot back to the human operator. We do not focus on the specific interfaces used for communication. Instead, we develop experimental and theoretical models of a) how communication changes the way humans interact with assistive robot arms, and b) how robots can harness these changes to better align with the human’s intent. We first conduct online and in-person user studies where participants operate robots that provide partial assistance, and we measure how the human’s inputs change with and without communication. With communication, we find that humans are more likely to intervene when the robot incorrectly predicts their intent, and more likely to release control when the robot correctly understands their task. We then use these findings to modify an established robot learning algorithm so that the robot can correctly interpret the human’s inputs when communication is present. Our results from a second in-person user study suggest that this combination of communication and learning outperforms assistive systems that isolate either learning or communication. See videos here: https://youtu.be/BET9yuVTVU4 
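To make the learning modification concrete, below is a minimal sketch, under our own assumptions rather than the paper's implementation, of how task inference might reweight the human's inputs once the robot communicates its prediction: with communication, interventions and releases of control are treated as stronger evidence.

```python
# Hedged sketch: Bayesian task inference whose evidence weight depends on
# whether the robot communicated its prediction. The weighting scheme and
# names are illustrative assumptions, not the paper's algorithm.
import numpy as np

def update_belief(belief, likelihoods, communicating, sharpness=2.0):
    """
    belief:        prior over candidate tasks, shape (n_tasks,)
    likelihoods:   P(observed human input | task), shape (n_tasks,)
    communicating: True if the robot displayed its predicted task
    sharpness:     assumed extra evidence weight under communication
    """
    # The user studies above suggest inputs are more informative with
    # communication: the human saw the robot's prediction, so intervening
    # (or releasing control) more directly signals disagreement (or
    # agreement) with that prediction.
    weight = sharpness if communicating else 1.0
    posterior = belief * likelihoods ** weight
    return posterior / posterior.sum()
```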
        
    
- Award ID(s): 2129201
- PAR ID: 10567709
- Publisher / Repository: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- Date Published:
- ISSN: 2153-0866
- ISBN: 979-8-3503-7770-5
- Page Range / eLocation ID: 11530 to 11536
- Format(s): Medium: X
- Location: Abu Dhabi, United Arab Emirates
- Sponsoring Org: National Science Foundation
More Like this
- Affective movement will likely be an important component of robotic interaction as more and more robots move into human-facing scenarios, where humans (consciously or unconsciously) constantly monitor the motion profiles of their counterparts in order to make judgments about their counterparts' state. Many current studies in affective movement recognition and generation seek either to increase a machine's ability to correctly identify human affect or to identify and create components of robotic movement that enhance human perception. However, very few of these studies investigate the influence of environmental context on a machine's ability to correctly identify human affect or a human's ability to correctly identify the affective intent of a robot. This paper presents the results of a user study that investigated how human perception of stylized walking sequences (created in [1]) varied based on the environment where they were portrayed. The results show that environmental context can impact a person's ability to correctly perceive the intended style of a movement.
- Robots should personalize how they perform tasks to match the needs of individual human users. Today's robots achieve this personalization by asking for the human's feedback in the task space. For example, an autonomous car might show the human two different ways to decelerate at stoplights and ask the human which of these motions they prefer. This current approach to personalization is indirect: based on the behaviors the human selects (e.g., decelerating slowly), the robot tries to infer their underlying preference (e.g., defensive driving). By contrast, our article develops a learning and interface-based approach that enables humans to directly indicate their desired style. We do this by learning an abstract, low-dimensional, and continuous canonical space from human demonstration data. Each point in the canonical space corresponds to a different style (e.g., defensive or aggressive driving), and users can directly personalize the robot's behavior by simply clicking on a point. Given the human's selection, the robot then decodes this canonical style across each task in the dataset; e.g., if the human selects a defensive style, the autonomous car personalizes its behavior to drive defensively when decelerating, passing other cars, or merging onto highways. We refer to our resulting approach as PECAN: Personalizing Robot Behaviors through a Learned Canonical Space (a minimal decoder sketch appears after this list). Our simulations and user studies suggest that humans prefer using PECAN to directly personalize robot behavior (particularly when those users become familiar with PECAN), and that users find the learned canonical space to be intuitive and consistent. See videos here: https://youtu.be/wRJpyr23PKI.
- Humans can leverage physical interaction to teach robot arms. This physical interaction takes multiple forms depending on the task, the user, and what the robot has learned so far. State-of-the-art approaches focus on learning from a single modality or combine only some interaction types, often by assuming that the robot has prior information about the features of the task and the reward structure. By contrast, in this article we introduce an algorithmic formalism that unites learning from demonstrations, corrections, and preferences. Our approach makes no assumptions about the tasks the human wants to teach the robot; instead, we learn a reward model from scratch by comparing the human's input to nearby alternatives, i.e., trajectories close to the human's feedback (a minimal sketch of this comparison appears after this list). We first derive a loss function that trains an ensemble of reward models to match the human's demonstrations, corrections, and preferences. The type and order of feedback is up to the human teacher: we enable the robot to collect this feedback passively or actively. We then apply constrained optimization to convert our learned reward into a desired robot trajectory. Through simulations and a user study, we demonstrate that our proposed approach learns manipulation tasks from physical human interaction more accurately than existing baselines, particularly when the robot is faced with new or unexpected objectives. Videos of our user study are available at https://youtu.be/FSUJsTYvEKU
- Shared autonomy enables robots to infer user intent and assist in accomplishing it. But when the user wants to do a new task that the robot does not know about, shared autonomy will hinder their performance by attempting to assist them with something that is not their intent. Our key idea is that the robot can detect when its repertoire of intents is insufficient to explain the user's input, and give them back control (a minimal arbitration sketch appears after this list). This then enables the robot to observe unhindered task execution, learn the new intent behind it, and add it to its repertoire. We demonstrate with both a case study and a user study that our proposed method maintains good performance when the human's intent is in the robot's repertoire, outperforms prior shared autonomy approaches when it isn't, and successfully learns new skills, enabling efficient lifelong learning for confidence-based shared autonomy.
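For the PECAN item above, here is a minimal sketch of the decoding step, assuming each behavior is summarized by a fixed-length parameter vector; the architecture, dimensions, and names are our assumptions for illustration.

```python
# Hedged sketch of decoding a canonical style point into task-specific
# behavior parameters. Dimensions and architecture are assumptions.
import torch
import torch.nn as nn

class CanonicalDecoder(nn.Module):
    def __init__(self, canonical_dim=2, task_dim=8, behavior_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(canonical_dim + task_dim, 64), nn.ReLU(),
            nn.Linear(64, behavior_dim),
        )

    def forward(self, style_point, task_embedding):
        # A single clicked style point (e.g., "defensive") decodes to a
        # behavior for every task embedding it is paired with.
        return self.net(torch.cat([style_point, task_embedding], dim=-1))
```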
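For the physical-interaction teaching item above, a minimal sketch of the "compare to nearby alternatives" idea, cast here as a Bradley-Terry style preference loss; the trajectory encoding and names are our assumptions, not the article's exact loss.

```python
# Hedged sketch: the human's demonstration or correction should outscore
# nearby perturbed alternatives under the learned reward. In an ensemble,
# this loss would be applied to each reward model independently.
import torch
import torch.nn.functional as F

def feedback_loss(reward_net, human_traj, alt_trajs):
    """human_traj: (T, d) states; alt_trajs: (K, T, d) nearby perturbations.
    Assumes reward_net maps (..., d) states to (..., 1) per-step rewards."""
    r_human = reward_net(human_traj).sum()          # return of human's input
    r_alts = reward_net(alt_trajs).sum(dim=(1, 2))  # returns of neighbors, (K,)
    # Bradley-Terry style: prefer the human's input over each alternative.
    return F.softplus(r_alts - r_human).mean()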
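And for the final item, a minimal sketch of confidence-based arbitration: if no known intent explains the human's input well enough, the robot cedes control and flags the trajectory for learning a new intent. The interface, likelihood scores, and threshold are our assumptions.

```python
# Hedged sketch of confidence-based shared autonomy. Assumed interface:
# each intent model scores the human's action and proposes an assist action.
import numpy as np

def arbitrate(observation, human_action, intent_models, threshold=0.1):
    """Assist only when some known intent explains the human's input."""
    scores = np.array([m.likelihood(observation, human_action)
                       for m in intent_models])
    confidence = scores.max()
    if confidence < threshold:
        # No intent in the repertoire explains the input: return full
        # control and record the trajectory to learn a new intent from it.
        return human_action, "collect_demonstration"
    assist = intent_models[int(np.argmax(scores))].action(observation)
    alpha = min(confidence, 1.0)  # assistance grows with confidence
    return alpha * assist + (1 - alpha) * human_action, "assist"
```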