A popular paradigm in robot learning is to train a policy from scratch for every new robot. This is not only inefficient but often impractical for complex robots. In this work, we consider the problem of transferring a policy between two robots with significantly different parameters, such as kinematics and morphology. Existing approaches that train a new policy by matching action or state-transition distributions, including imitation learning methods, fail because the optimal action and/or state distributions differ between robots. In this paper, we propose REvolveR, a novel method that uses continuous evolutionary models, implemented in a physics simulator, for robot policy transfer. We interpolate between the source robot and the target robot by finding a continuous evolutionary change of robot parameters. An expert policy on the source robot is transferred by training on a sequence of intermediate robots that gradually evolve into the target robot. Experiments in a physics simulator show that the proposed continuous evolutionary model can effectively transfer the policy across robots and achieves superior sample efficiency on new robots. The proposed method is especially advantageous in sparse-reward settings, where it significantly reduces the required exploration.
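To make the idea concrete, here is a minimal sketch of the interpolate-and-fine-tune loop, assuming robot parameters can be blended linearly; `make_robot` and `finetune_policy` are hypothetical hooks standing in for simulator construction and a short RL run, not part of the published implementation.

```python
import numpy as np

def interpolate_robot(source_params, target_params, alpha):
    """Blend robot parameters (e.g., link lengths, joint limits) at ratio alpha.

    Assumes a flat dict of numeric parameters; the actual method evolves the
    full kinematic/morphological description inside the simulator.
    """
    return {k: (1.0 - alpha) * source_params[k] + alpha * target_params[k]
            for k in source_params}

def evolve_policy(policy, source_params, target_params, make_robot,
                  finetune_policy, num_stages=100):
    """Transfer an expert policy along a sequence of intermediate robots."""
    for alpha in np.linspace(0.0, 1.0, num_stages + 1)[1:]:
        robot = make_robot(interpolate_robot(source_params, target_params, alpha))
        policy = finetune_policy(policy, robot)  # brief training on this intermediate robot
    return policy
```

The intent is that consecutive intermediate robots differ only slightly, so each stage requires only a small amount of additional training.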
Optimization-Based Robot Team Exploration Considering Attrition and Communication Constraints
Exploring robots may fail due to environmental hazards, so they need to account for the possibility of failure when planning exploration paths. Optimizing expected utility enables robots to find plans that balance achievable reward against the inherent risks of exploration. Moreover, when robots rendezvous and communicate to exchange observations, they increase the probability that at least one robot is able to return with the map. Optimal exploration is NP-hard, so we apply a constraint-based approach that enables highly engineered solution techniques. We model exploration under the possibility of robot failure and communication constraints as an integer linear program and a generalization of the Vehicle Routing Problem. Empirically, we show that for several scenarios this formulation produces paths within 50% of a theoretical optimum and achieves twice as much reward as a baseline greedy approach.
- Award ID(s): 1823245
- PAR ID: 10340559
- Date Published:
- Journal Name: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
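As a rough illustration of the expected-utility objective described in the abstract above (not the paper's integer linear program or VRP formulation), the sketch below scores a set of candidate routes under independent per-edge failure probabilities. It assumes a robot must complete its whole route to return its observations, and that rendezvous-based sharing is summarized by a `carriers` map; all names and the independence assumption are illustrative.

```python
def expected_returned_reward(paths, rewards, hazard, carriers):
    """Expected reward that makes it back to base despite possible attrition.

    paths:    {robot: [edge, ...]}        planned route for each robot
    hazard:   {edge: failure probability} independent per-edge failure model
    rewards:  {site: value}               reward for observing each site
    carriers: {site: [robots holding the observation after rendezvous sharing]}
    """
    # Probability that each robot survives its entire route.
    survive = {robot: 1.0 for robot in paths}
    for robot, path in paths.items():
        for edge in path:
            survive[robot] *= 1.0 - hazard[edge]

    # A site's reward is returned if at least one robot carrying it survives.
    total = 0.0
    for site, value in rewards.items():
        p_all_lost = 1.0
        for robot in carriers.get(site, []):
            p_all_lost *= 1.0 - survive[robot]
        total += value * (1.0 - p_all_lost)
    return total
```

Rendezvous helps in this model because adding a second carrier for a site multiplies the probability that the observation is lost, raising the expected returned reward.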
More Like this
In many exploration scenarios, it is important for robots to efficiently explore new areas and constantly communicate results. Mobile robots inherently couple motion and network topology due to the effects of position on wireless propagation, e.g., distance or obstacles between network nodes. Information gain is a useful measure of exploration. However, finding paths that maximize information gain while preserving communication is challenging due to the non-Markovian nature of information gain, discontinuities in network topology, and zero-reward local optima. We address these challenges through an optimization- and sampling-based algorithm. Our algorithm scales to 50% more robots and obtains 2-5 times more information relative to path cost compared to baseline planning approaches.
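The non-Markovian nature of information gain mentioned in this abstract is easy to see in a small sketch: the value of visiting a waypoint depends on everything observed earlier along the path, so the objective cannot be split into independent per-waypoint rewards. Here `visible_cells` is a hypothetical sensor model returning the set of grid cells seen from a waypoint.

```python
def path_information_gain(path, visible_cells):
    """Count cells newly observed along an ordered path of waypoints."""
    seen = set()
    gain = 0
    for waypoint in path:
        new = visible_cells(waypoint) - seen  # discount cells already observed upstream
        gain += len(new)
        seen |= new
    return gain
```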
Safe path planning is critical for bipedal robots to operate in safety-critical environments. Common path planning algorithms, such as RRT or RRT*, typically use geometric or kinematic collision-check algorithms to ensure collision-free paths toward the target position. However, such approaches may generate non-smooth paths that do not comply with the dynamics constraints of walking robots. It has been shown that the control barrier function (CBF) can be integrated with RRT/RRT* to synthesize dynamically feasible collision-free paths. Yet, existing work has been limited to simple circular or elliptical obstacles due to the challenging nature of constructing appropriate barrier functions to represent irregularly shaped obstacles. In this paper, we present a CBF-based RRT* algorithm for bipedal robots to generate a collision-free path through a space with multiple polynomial-shaped obstacles. In particular, we use logistic regression to construct polynomial barrier functions from a grid map of the environment to represent irregularly shaped obstacles. Moreover, we develop a multi-step CBF steering controller to ensure the efficiency of free-space exploration. The proposed approach was first validated in simulation for a differential-drive model and then experimentally evaluated with a 3D humanoid robot, Digit, in a lab setting with randomly placed obstacles.
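As a sketch of the stated idea of fitting polynomial barrier functions with logistic regression, the snippet below labels grid cells as free or occupied and uses the decision function of a logistic-regression classifier over polynomial features as a candidate barrier h(x, y), positive in free space and negative inside obstacles. The degree, labeling convention, and scikit-learn usage are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

def fit_polynomial_barrier(grid, resolution=1.0, degree=4):
    """Fit h(x, y) = w^T phi(x, y) from an occupancy grid (1 = free, 0 = occupied)."""
    ys, xs = np.mgrid[0:grid.shape[0], 0:grid.shape[1]]
    points = np.column_stack([xs.ravel(), ys.ravel()]).astype(float) * resolution
    labels = grid.ravel()  # 1 = free space, 0 = obstacle

    phi = PolynomialFeatures(degree=degree, include_bias=True)
    clf = LogisticRegression(max_iter=2000).fit(phi.fit_transform(points), labels)

    def h(p):
        """Signed barrier value: > 0 in (estimated) free space, < 0 inside obstacles."""
        return clf.decision_function(phi.transform(np.atleast_2d(p)))[0]

    return h
```

The fitted h could then enter a CBF-style constraint along candidate RRT* edges (roughly, keeping h non-decreasing faster than it can decay), but the steering controller itself is beyond this sketch.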
This paper presents a novel framework for memory-based navigation for terrestrial robots, utilizing a customized multimodal large language model (MLLM) to interpret visual inputs and generate navigation commands. The system employs a Unitree GO1 robot equipped with a camera to capture environmental images, which are processed by the customized MLLM for navigation. By leveraging a memory-based approach, the robot efficiently reuses previously traversed paths, reducing the need for re-exploration and enhancing navigation efficiency. The hybrid controller in this work features a deliberation unit for high-level commands and a reactive controller for robot alignment. Experimental validation in a hallway-like environment demonstrates that memory-driven navigation improves path retracing and overall performance.
Learning from Demonstration (LfD) is a promising approach to enable Multi-Robot Systems (MRS) to acquire complex skills and behaviors. However, the intricate interactions and coordination challenges in MRS pose significant hurdles for effective LfD. In this paper, we present a novel LfD framework specifically designed for MRS, which leverages visual demonstrations to capture and learn from robot-robot and robot-object interactions. Our framework introduces the concept of Interaction Keypoints (IKs) to transform the visual demonstrations into a representation that facilitates the inference of various skills necessary for the task. The robots then execute the task using sensorimotor actions and reinforcement learning (RL) policies when required. A key feature of our approach is the ability to handle unseen contact-based skills that emerge during the demonstration. In such cases, RL is employed to learn the skill using a classifier-based reward function, eliminating the need for manual reward engineering and ensuring adaptability to environmental changes. We evaluate our framework across a range of mobile robot tasks, covering both behavior-based and contact-based domains. The results demonstrate the effectiveness of our approach in enabling robots to learn complex multi-robot tasks and behaviors from visual demonstrations.
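The classifier-based reward idea can be sketched as follows; `success_classifier` stands in for a learned model (e.g., trained on frames labeled from the demonstration), and the shaping and threshold choices are illustrative assumptions rather than the paper's design.

```python
class ClassifierReward:
    """Reward derived from a learned success classifier.

    Replaces hand-engineered reward terms for a contact-based skill: the RL
    agent is rewarded according to how likely the current observation looks
    like a successfully completed skill.
    """

    def __init__(self, success_classifier, threshold=0.9, bonus=1.0):
        self.success_classifier = success_classifier  # observation -> success probability
        self.threshold = threshold
        self.bonus = bonus

    def __call__(self, observation):
        p = self.success_classifier(observation)
        # Dense shaping from the success probability plus a sparse completion bonus.
        return p + (self.bonus if p > self.threshold else 0.0)
```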