In this paper, we present Combined Learning from demonstration And Motion Planning (CLAMP) as an efficient approach to skill learning and generalizable skill reproduction. CLAMP combines the strengths of Learning from Demonstration (LfD) and motion planning into a unifying framework. We carry out probabilistic inference to find trajectories which are optimal with respect to a given skill and also feasible in different scenarios. We use factor graph optimization to speed up inference. To encode optimality, we provide a new probabilistic skill model based on a stochastic dynamical system. This skill model requires minimal parameter tuning to learn, is suitable to encode skill constraints, and allows efficient inference. Preliminary experimental results showing skill generalization over initial robot state and unforeseen obstacles are presented.
Safe and Robust Robot Learning from Demonstration through Conceptual Constraints
This thesis summary presents research focused on incorporating high-level abstract behavioral requirements, called 'conceptual constraints', into the modeling processes of robot Learning from Demonstration (LfD) techniques. This idea is realized via an LfD algorithm called Concept Constrained Learning from Demonstration. This algorithm encodes motion planning constraints as temporally associated logical formulae of Boolean operators that enforce high-level constraints over portions of the robot's motion plan during learned skill execution. This results in more easily trained, more robust, and safer learned skills. Current work focuses on automating constraint discovery, introducing conceptual constraints into human-aware motion planning algorithms, and expanding upon trajectory alignment techniques for LfD. Future work will focus on how concept constrained algorithms and models are best incorporated into effective interfaces for end-users.
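The idea of temporally associated Boolean constraints can be illustrated with a minimal Python sketch. This is not the Concept Constrained Learning from Demonstration implementation. It only assumes that a motion plan is a sequence of waypoints and that each constraint is a Boolean formula active over a range of waypoint indices. All names here (`TimedConstraint`, `all_of`, `violations`, and the `tilt` and `height` state keys) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical robot state: a flat mapping of named features to values.
State = Dict[str, float]
Predicate = Callable[[State], bool]


@dataclass
class TimedConstraint:
    """A Boolean formula that must hold on waypoints start..end (inclusive)."""
    start: int
    end: int
    formula: Predicate


def all_of(*preds: Predicate) -> Predicate:
    """Logical AND of several predicates."""
    return lambda s: all(p(s) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Logical OR of several predicates."""
    return lambda s: any(p(s) for p in preds)


def negate(pred: Predicate) -> Predicate:
    """Logical NOT of a predicate."""
    return lambda s: not pred(s)


def violations(plan: Sequence[State],
               constraints: Sequence[TimedConstraint]) -> List[int]:
    """Return the indices of waypoints that break any active constraint."""
    bad = []
    for i, state in enumerate(plan):
        for c in constraints:
            if c.start <= i <= c.end and not c.formula(state):
                bad.append(i)
                break
    return bad


# Illustrative predicates with assumed thresholds.
def upright(s: State) -> bool:
    return s["tilt"] < 0.2


def above_table(s: State) -> bool:
    return s["height"] > 0.1


plan = [
    {"tilt": 0.5, "height": 0.0},
    {"tilt": 0.1, "height": 0.3},
    {"tilt": 0.3, "height": 0.3},
    {"tilt": 0.05, "height": 0.2},
    {"tilt": 0.6, "height": 0.0},
]
# The object must stay upright and above the table during waypoints 1 to 3.
constraints = [TimedConstraint(1, 3, all_of(upright, above_table))]
violating = violations(plan, constraints)
```

In this sketch, a planner or skill model could reject or repair the waypoints listed in `violating`. How the actual algorithm enforces constraints during learned skill execution is not specified in this summary.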
- Award ID(s): 1830686
- PAR ID: 10191671
- Date Published:
- Journal Name: Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction
- Page Range / eLocation ID: 588 to 590
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Learning from Demonstration (LfD) enables novice users to teach robots new skills. However, many LfD methods do not facilitate skill maintenance and adaptation. Changes in task requirements or in the environment often reveal the lack of resiliency and adaptability in the skill model. To overcome these limitations, we introduce ARC-LfD: an Augmented Reality (AR) interface for constrained Learning from Demonstration that allows users to maintain, update, and adapt learned skills. This is accomplished through in situ visualizations of learned skills and constraint-based editing of existing skills without requiring further demonstration. We describe the existing algorithmic basis for this system as well as our Augmented Reality interface and the novel capabilities it provides. Finally, we provide three case studies that demonstrate how ARC-LfD enables users to adapt to changes in the environment or task which require a skill to be altered after initial teaching has taken place.
In this work, we develop an open-source surgical simulation environment that includes a realistic model obtained by MRI-scanning a physical phantom, for the purpose of training and evaluating a Learning from Demonstration (LfD) algorithm for autonomous suturing. The LfD algorithm utilizes Dynamic Movement Primitives (DMP) and Locally Weighted Regression (LWR), but focuses on the needle trajectory, rather than the instruments, to obtain better generality with respect to needle grasps. We conduct a user study to collect multiple suturing demonstrations and perform a comprehensive analysis of the ability of the LfD algorithm to generalize from a demonstration at one location in one phantom to different locations in the same phantom and to a different phantom. Our results show good generalization, on the order of 91.5%, when learning from more experienced subjects, which points to the need to integrate skill assessment in the future.
Learning from Demonstration (LfD) is a popular approach to endowing robots with skills without having to program them by hand. Typically, LfD relies on human demonstrations in clutter-free environments. This prevents the demonstrations from being affected by irrelevant objects, whose influence can obfuscate the true intention of the human or the constraints of the desired skill. However, it is unrealistic to assume that the robot's environment can always be restructured to remove clutter when capturing human demonstrations. To contend with this problem, we develop an importance-weighted skill learning approach, building on a recent inference-based technique for skill representation and reproduction. Our approach reduces unwanted environmental influences on the learned skill, while still capturing the salient human behavior. We provide both batch and incremental versions of our approach and validate our algorithms on a 7-DOF JACO2 manipulator with reaching and placing skills.
This paper presents a framework to learn the reward function underlying high-level sequential tasks from demonstrations. The purpose of reward learning, in the context of learning from demonstration (LfD), is to generate policies that mimic the demonstrator’s policies, thereby enabling imitation learning. We focus on a human-robot interaction (HRI) domain where the goal is to learn and model structured interactions between a human and a robot. Such interactions can be modeled as a partially observable Markov decision process (POMDP) where the partial observability is caused by uncertainties associated with the ways humans respond to different stimuli. The key challenge in finding a good policy in such a POMDP is determining the reward function that was observed by the demonstrator. Existing inverse reinforcement learning (IRL) methods for POMDPs are computationally very expensive and the problem is not well understood. In comparison, IRL algorithms for Markov decision processes (MDPs) are well defined and computationally efficient. We propose an approach to reward function learning for high-level sequential tasks from human demonstrations where the core idea is to reduce the underlying POMDP to an MDP and apply any efficient MDP-IRL algorithm. Our extensive experiments suggest that the reward function learned this way generates POMDP policies that mimic the policies of the demonstrator well.

