Controller design for bipedal walking on dynamic rigid surfaces (DRSes), which are rigid surfaces moving in the inertial frame (e.g., ships and airplanes), remains largely underexplored. This paper introduces a hierarchical control approach that achieves stable underactuated bipedal walking on a horizontally oscillating DRS. The highest layer of our approach is a real-time motion planner that generates desired global behaviors (i.e., center of mass trajectories and footstep locations) by stabilizing a reduced-order robot model. One key novelty of this layer is the derivation of the reduced-order model by analytically extending the angular momentum based linear inverted pendulum (ALIP) model from stationary to horizontally moving surfaces. The other novelty is the development of a discrete-time foot-placement controller that exponentially stabilizes the hybrid, linear, time-varying ALIP. The middle layer translates the desired global behaviors into the robot’s full-body reference trajectories for all directly actuated degrees of freedom, while the lowest layer exponentially tracks those reference trajectories based on the full-order, hybrid, nonlinear robot model. Simulations confirm that the proposed framework ensures stable walking of a planar underactuated biped under different swaying DRS motions and gait types.
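The planner layer described above can be illustrated with the stationary-surface ALIP and a one-step-ahead foot-placement law; the paper's contribution extends these dynamics with surface-acceleration terms, which this sketch omits. All numerical values (mass, CoM height, step duration, target momentum) are illustrative assumptions, not the paper's.

```python
import numpy as np

# Stationary-surface ALIP: x is the CoM position relative to the stance
# foot, L the angular momentum about the contact point. Dynamics:
#   xdot = L / (m H),  Ldot = m g x.
m, H, g = 32.0, 0.9, 9.81        # assumed mass [kg], CoM height [m], gravity
lam = np.sqrt(g / H)             # pendulum time constant
T = 0.4                          # assumed step duration [s]

def foot_placement(x, L, L_des):
    """Use the closed-form ALIP flow
         x(t) = cosh(lam t) x + sinh(lam t) L / (m H lam)
         L(t) = m H lam sinh(lam t) x + cosh(lam t) L
    to pick the touchdown offset u (new foot relative to the CoM) so the
    momentum at the END of the next step equals L_des (deadbeat in L)."""
    c, s = np.cosh(lam * T), np.sinh(lam * T)
    x_e = c * x + s * L / (m * H * lam)          # end-of-step CoM offset
    L_e = m * H * lam * s * x + c * L            # end-of-step momentum
    u = (c * L_e - L_des) / (m * H * lam * s)    # placement target
    return u, L_e

# A few steps: L reaches L_des one step after the first placement
x, L, L_des = 0.0, 0.5, 10.0
for _ in range(3):
    u, L = foot_placement(x, L, L_des)
    x = -u   # impact reset: CoM re-expressed w.r.t. the new stance foot
```

The placement law is linear in the end-of-step state, which is what makes the discrete-time stability analysis in the paper tractable.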
Optimizing Bipedal Maneuvers of Single Rigid-Body Models for Reinforcement Learning
In this work, we propose a method to generate reduced-order model reference trajectories for general classes of highly dynamic bipedal maneuvers, for use in sim-to-real reinforcement learning. Our approach uses a single rigid-body model (SRBM) to optimize libraries of trajectories offline; these serve as expert references that guide learning by regularizing behaviors when incorporated into the reward function of a learned policy. This method translates the model's dynamically rich rotational and translational behavior to a full-order robot model and successfully transfers to real hardware. The SRBM's simplicity allows for fast iteration and refinement of behaviors, while the robustness of learning-based controllers allows highly dynamic motions to be transferred to hardware. In this work we introduce a set of transferability constraints that adapt the SRBM dynamics to actual bipedal robot hardware, our framework for creating optimal trajectories for a variety of highly dynamic maneuvers, and our approach to integrating reference trajectories into a high-speed running reinforcement learning policy. We validate our methods on the bipedal robot Cassie, on which we successfully demonstrated highly dynamic grounded running gaits at up to 3.0 m/s.
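The SRBM that the offline optimizer works with can be sketched as a lumped mass with rotational inertia driven by contact forces at the feet. The version below is planar for brevity, and all parameter values are illustrative assumptions, not Cassie's.

```python
import numpy as np

m, I, g = 31.0, 2.0, 9.81   # assumed mass [kg], pitch inertia [kg m^2], gravity

def srbm_dynamics(state, f, p_foot):
    """state = [x, z, theta, xd, zd, thetad]; f = [fx, fz] is the contact
    force applied at the world-frame foot position p_foot."""
    x, z, th, xd, zd, thd = state
    xdd = f[0] / m
    zdd = f[1] / m - g
    r = p_foot - np.array([x, z])           # lever arm from CoM to foot
    thdd = (r[0] * f[1] - r[1] * f[0]) / I  # planar cross product / inertia
    return np.array([xd, zd, thd, xdd, zdd, thdd])

def rollout(state, f, p_foot, dt=1e-3, T=0.3):
    """Forward-Euler rollout under a constant contact force."""
    for _ in range(int(T / dt)):
        state = state + dt * srbm_dynamics(state, f, p_foot)
    return state

# Gravity-cancelling vertical force: the CoM height stays constant while
# the forward drift of the CoM past the foot induces a pitch moment.
s0 = np.array([0.0, 0.8, 0.0, 0.5, 0.0, 0.0])
s1 = rollout(s0, f=np.array([0.0, m * g]), p_foot=np.array([0.0, 0.0]))
```

A trajectory optimizer over this model chooses the force profile (subject to the paper's transferability constraints) rather than holding it constant as in this rollout.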
- PAR ID: 10394232
- Date Published:
- Journal Name: 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids)
- Page Range / eLocation ID: 714 to 721
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Linear policies are the simplest class of policies that can achieve stable bipedal walking behaviors in both simulation and hardware. However, a significant challenge in deploying them widely is the difficulty of extending them to more dynamic behaviors such as hopping and running. In this work, we therefore propose a new class of linear policies in which template models can be embedded. In particular, we show how to embed the Spring Loaded Inverted Pendulum (SLIP) model in the policy class and realize perpetual hopping in arbitrary directions. The spring constant of the template model is learned in addition to the remaining parameters of the policy. Given this spring constant, the goal is to realize hopping trajectories using the SLIP model, which are then tracked by the bipedal robot using the linear policy. Continuous hopping with adjustable heading direction was achieved across different terrains in simulation, with heading and lateral velocities of up to 0.5 m/s and 0.05 m/s, respectively. The policy was then transferred to hardware, and preliminary hopping results (> 10 steps) were achieved.
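The SLIP template above can be sketched in its one-dimensional core: a vertical spring-mass hopper whose spring constant plays the role of the learned template parameter. All values here are assumptions for illustration.

```python
import numpy as np

m, g, k, l0 = 12.0, 9.81, 4000.0, 0.6  # assumed mass, gravity, stiffness, rest leg length

def hop(zd_td, dt=1e-5):
    """Integrate one stance phase from touchdown (z = l0, downward speed
    zd_td < 0) to liftoff, then return the ballistic apex height."""
    z, zd = l0, zd_td
    while True:
        F = k * max(l0 - z, 0.0)       # spring pushes only in compression
        zd += dt * (F / m - g)
        z += dt * zd
        if z >= l0 and zd > 0.0:       # leg at rest length, moving up: liftoff
            break
    return z + zd**2 / (2 * g)         # apex of the flight phase

apex0 = 0.8                            # previous apex height [m]
apex1 = hop(-np.sqrt(2 * g * (apex0 - l0)))
# The spring is conservative, so the hop approximately returns to the same
# apex; a policy modulates k and leg placement to shape the motion instead.
```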
- This work presents a hierarchical framework for bipedal locomotion that combines a Reinforcement Learning (RL)-based high-level (HL) planner policy, which generates task space commands online, with a model-based low-level (LL) controller that tracks the desired task space trajectories. Unlike traditional end-to-end learning approaches, our HL policy takes insights from the angular momentum-based linear inverted pendulum (ALIP) to carefully design the observation and action spaces of the Markov Decision Process (MDP). This simple yet effective design creates an insightful mapping between a low-dimensional state that effectively captures the complex dynamics of bipedal locomotion and a set of task space outputs that shape the robot's walking gait. The HL policy is agnostic to the task space LL controller, which increases the flexibility of the design and the framework's generalization to other bipedal robots. This hierarchical design yields a learning-based framework with improved performance, data efficiency, and robustness compared with the ALIP model-based approach and with state-of-the-art learning-based frameworks for bipedal locomotion. The proposed hierarchical controller is tested on three different robots: Rabbit, a five-link underactuated planar biped; Walker2D, a seven-link fully actuated planar biped; and Digit, a 3D humanoid robot with 20 actuated joints. The trained policy naturally learns human-like locomotion behaviors and effectively tracks a wide range of walking speeds while preserving the robustness and stability of the walking gait, even under adversarial conditions.
- This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond a single locomotion skill, we develop a general control solution that covers a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a novel dual-history architecture that utilizes both a long-term and a short-term input/output (I/O) history of the robot. When trained through the proposed end-to-end RL approach, this control architecture consistently outperforms other methods across a diverse range of skills, in both simulation and the real world. The study also examines the adaptivity and robustness that the proposed RL system introduces when developing locomotion controllers. We demonstrate that the proposed architecture can adapt to both time-invariant dynamics shifts and time-variant changes, such as contact events, by effectively using the robot's I/O history. We also identify task randomization as another key source of robustness, fostering better task generalization and compliance with disturbances. The resulting control policies can be successfully deployed on Cassie, a torque-controlled human-sized bipedal robot. This work pushes the limits of agility for bipedal robots through extensive real-world experiments, demonstrating a diverse range of locomotion skills: robust standing, versatile walking, fast running with a 400-meter dash demonstration, and a diverse set of jumping skills such as standing long jumps and high jumps.
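The dual-history idea above can be sketched as a pair of rolling buffers whose concatenation forms the policy's observation: a short recent window plus a longer-horizon window of the robot's I/O (measurement/action) pairs. The window lengths and dimensions below are assumptions, not the paper's values.

```python
import numpy as np
from collections import deque

class DualHistory:
    """Keep a short-term and a long-term window of I/O pairs; both are
    zero-initialized so the observation has a fixed size from step 0."""
    def __init__(self, short_len=4, long_len=64, obs_dim=6, act_dim=4):
        io_dim = obs_dim + act_dim
        self.short = deque([np.zeros(io_dim)] * short_len, maxlen=short_len)
        self.long = deque([np.zeros(io_dim)] * long_len, maxlen=long_len)

    def push(self, obs, act):
        """Record one control step's I/O pair in both windows."""
        io = np.concatenate([obs, act])
        self.short.append(io)
        self.long.append(io)

    def observation(self):
        """Fixed-size policy input: short window followed by long window."""
        return np.concatenate(list(self.short) + list(self.long))
```

Because each `deque` has a `maxlen`, the oldest entry is dropped automatically on every push, so the observation size never changes as history accumulates.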
- Large-scale soft robots have the capability and potential to perform highly dynamic tasks such as hammering a nail into a board, throwing items long distances, or manipulating objects in cluttered environments, owing to their underdamped joints and their ability to store potential energy. The soft robots presented in this article are pneumatically actuated and can therefore perform these tasks without large motors or gear trains. However, getting soft robots to perform highly dynamic tasks requires controllers that can track highly dynamic trajectories, which is difficult for soft robots because of the uncertainty in their shape and their complicated dynamics and kinematics. This article presents a formulation of a model reference adaptive controller (MRAC) that causes a three-link soft robot arm to behave like a highly dynamic, critically damped second-order system. Using the dynamics of a second-order system, we also present a method to generate joint trajectories for throwing a ball to a desired point in Cartesian space. We demonstrate the viability of our joint-level controller in simulation and on hardware, with a reported maximum root-mean-square error of 0.0872 radians between a reference and an executed trajectory. We also demonstrate that our combined MRAC controller and trajectory generator can, on average, throw a ball to within 25–28% of a desired landing location for throwing distances between 1.5 and 2 m on real hardware.
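The critically damped second-order behavior that MRAC drives the arm to imitate can be sketched as a reference model rolled out per joint; the natural frequency below is an assumed value, not the article's.

```python
import numpy as np

# Reference model: qdd = wn^2 (q_des - q) - 2 wn qd, i.e. a second-order
# system with damping ratio 1 (critically damped).
wn = 8.0  # assumed natural frequency [rad/s]

def reference_trajectory(q_des, dt=1e-4, T=2.0):
    """Semi-implicit Euler rollout of the reference model from rest."""
    q, qd, traj = 0.0, 0.0, []
    for _ in range(int(T / dt)):
        qdd = wn**2 * (q_des - q) - 2.0 * wn * qd
        qd += dt * qdd
        q += dt * qd
        traj.append(q)
    return np.array(traj)

traj = reference_trajectory(1.0)
# Critical damping gives the fastest second-order response that settles
# at the target without overshoot, a useful property for joint tracking.
```

For throwing, one would release the ball partway along such a trajectory, when the joint velocity implied by the model matches the desired launch velocity.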