Learning from Demonstration (LfD) approaches empower end-users
to teach robots novel tasks via demonstrations of the desired behaviors, democratizing
access to robotics. However, current LfD frameworks are not capable
of fast adaptation to heterogeneous human demonstrations or of large-scale deployment
in ubiquitous robotics applications. In this paper, we propose a novel
LfD framework, Fast Lifelong Adaptive Inverse Reinforcement learning (FLAIR).
Our approach (1) leverages learned strategies to construct policy mixtures for fast
adaptation to new demonstrations, allowing for quick end-user personalization;
(2) distills common knowledge across demonstrations, achieving accurate task inference;
and (3) expands its model only when needed in lifelong deployments,
maintaining a concise set of prototypical strategies that can approximate all behaviors
via policy mixtures. We empirically validate that FLAIR achieves adaptability
(i.e., the robot adapts to heterogeneous, user-specific task preferences), efficiency
(i.e., the robot achieves sample-efficient adaptation), and scalability (i.e.,
the model grows sublinearly with the number of demonstrations while maintaining
high performance). FLAIR surpasses benchmarks across three control tasks
with an average 57% improvement in policy returns and an average 78% fewer
episodes required for demonstration modeling using policy mixtures. Finally, we
demonstrate the success of FLAIR in a table tennis task and find users rate FLAIR
as having higher task (p < .05) and personalization (p < .05) performance.
This content will become publicly available on March 11, 2025
Enhancing Safety in Learning from Demonstration Algorithms via Control Barrier Function Shielding
Learning from Demonstration (LfD) is a powerful method for non-roboticist
end-users to teach robots new tasks, enabling them to
customize the robot behavior. However, modern LfD techniques do
not explicitly synthesize safe robot behavior, which limits the deployability
of these approaches in the real world. To enforce safety
in LfD without relying on experts, we propose a new framework,
ShiElding with Control barrier fUnctions in inverse REinforcement
learning (SECURE), which learns a customized Control Barrier
Function (CBF) from end-users that prevents robots from taking
unsafe actions while imposing little interference with the task completion.
We evaluate SECURE in three sets of experiments. First,
we empirically validate that SECURE learns a high-quality CBF from
demonstrations and outperforms conventional LfD methods on simulated
robotic and autonomous driving tasks, with improvements
in safety of up to 100%. Second, we demonstrate that roboticists
can leverage SECURE to outperform conventional LfD approaches
on a real-world knife-cutting, meal-preparation task by 12.5% in
task completion while driving the number of safety violations to
zero. Finally, we demonstrate in a user study that non-roboticists
can use SECURE to effectively teach the robot safe policies that
avoid collisions with the person and prevent coffee from spilling.
- Award ID(s): 2219755
- NSF-PAR ID: 10499423
- Publisher / Repository: ACM
- Date Published:
- Journal Name: Proceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction
- ISBN: 9798400703225
- Page Range / eLocation ID: 820 to 829
- Subject(s) / Keyword(s): Learning from Demonstration, Control Barrier Function, Safety
- Format(s): Medium: X
- Location: Boulder CO USA
- Sponsoring Org: National Science Foundation
More Like this
-
With growing access to versatile robotics, it is beneficial for end users to be able to teach robots tasks without needing to code a control policy. One possibility is to teach the robot through successful task executions. However, near-optimal demonstrations of a task can be difficult to provide, and even successful demonstrations can fail to capture task aspects key to robust skill replication. Here, we propose a learning from demonstration (LfD) approach that enables learning of robust task definitions without the need for near-optimal demonstrations. We present a novel algorithmic framework for learning task specifications based on the ergodic metric, a measure of information content in motion. Moreover, we make use of negative demonstrations (demonstrations of what not to do) and show that they can help compensate for imperfect demonstrations, reduce the number of demonstrations needed, and highlight crucial task elements, improving robot performance. In a proof-of-concept example of cart-pole inversion, we show that negative demonstrations alone can be sufficient to successfully learn and recreate a skill. Through a human subject study with 24 participants, we show that consistently more information about a task can be captured from combined positive and negative (posneg) demonstrations than from the same amount of just positive demonstrations. Finally, we demonstrate our learning approach on simulated tasks of target reaching and table cleaning with a 7-DoF Franka arm. Our results point towards a future with robust, data-efficient LfD for novice users.
-
Robot-mediated therapy is an emerging field of research seeking to improve therapy for children with Autism Spectrum Disorder (ASD). Current approaches to autonomous robot-mediated therapy often focus on having a robot teach a single skill to children with ASD and lack a personalized approach to each individual. More recently, Learning from Demonstration (LfD) approaches are being explored to teach socially assistive robots to deliver personalized interventions after they have been deployed, but these approaches require large amounts of demonstrations and utilize learning models that cannot be easily interpreted. In this work, we present an LfD system capable of learning the delivery of autism therapies in a data-efficient manner utilizing learning models that are inherently interpretable. The LfD system learns a behavioral model of the task with minimal supervision via hierarchical clustering and then learns an interpretable policy to determine when to execute the learned behaviors. The system is able to learn from less than an hour of demonstrations and, for each of its predictions, can identify demonstrated instances that contributed to its decision. The system performs well under unsupervised conditions and achieves even better performance with a low-effort human correction process that is enabled by the interpretable model.
-
The field of end-user robot programming seeks to develop methods that empower non-expert programmers to task and modify robot operations. In doing so, researchers may enhance robot flexibility and broaden the scope of robot deployments into the real world. We introduce PRogramAR (Programming Robots using Augmented Reality), a novel end-user robot programming system that combines the intuitive visual feedback of augmented reality (AR) with the simplistic and responsive paradigm of trigger-action programming (TAP) to facilitate human-robot collaboration. Through PRogramAR, users are able to rapidly author task rules and desired reactive robot behaviors, while specifying task constraints and observing program feedback contextualized directly in the real world. PRogramAR provides feedback by simulating the robot’s intended behavior and providing instant evaluation of TAP rule executability to help end users better understand and debug their programs during development. In a system validation, 17 end users ranging from ages 18 to 83 used PRogramAR to program a robot to assist them in completing three collaborative tasks. Our results demonstrate how merging the benefits of AR and TAP using elements from prior robot programming research into a single novel system can successfully enhance the robot programming process for non-expert users.
-
This work provides a decentralized approach to safety by combining tools from control barrier functions (CBF) and nonlinear model predictive control (NMPC). It is shown how leveraging backup safety controllers allows for the robust implementation of CBF over the NMPC computation horizon, ensuring safety in nonlinear systems with actuation constraints. A leader-follower approach to control barrier function (LFCBF) enforcement is introduced as a strategy to enable a robot leader, in multi-robot interactions, to complete its task in minimum time and hence to maneuver aggressively. An algorithmic implementation of the proposed solution is provided, and safety is verified via simulation.