skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Optimizing Bipedal Maneuvers of Single Rigid-Body Models for Reinforcement Learning
In this work, we propose a method to generate reduced-order model reference trajectories for general classes of highly dynamic maneuvers for bipedal robots for use in sim-to-real reinforcement learning. Our approach is to utilize a single rigid-body model (SRBM) to optimize libraries of trajectories offline to be used as expert references that guide learning by regularizing behaviors when incorporated in the reward function of a learned policy. This method translates the model's dynamically rich rotational and translational behavior to a full-order robot model and successfully transfers to real hardware. The SRBM's simplicity allows for fast iteration and refinement of behaviors, while the robustness of learning-based controllers allows for highly dynamic motions to be transferred to hardware. Within this work we introduce a set of transferability constraints that amend the SRBM dynamics to actual bipedal robot hardware, our framework for creating optimal trajectories for a variety of highly dynamic maneuvers as well as our approach to integrating reference trajectories for a high-speed running reinforcement learning policy. We validate our methods on the bipedal robot Cassie on which we were successfully able to demonstrate highly dynamic grounded running gaits up to 3.0 m/s.  more » « less
Award ID(s):
1653220 1849343
PAR ID:
10394232
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids)
Page Range / eLocation ID:
714 to 721
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Controller design for bipedal walking on dynamic rigid surfaces (DRSes), which are rigid surfaces moving in the inertial frame (e.g., ships and airplanes), remains largely underexplored. This paper introduces a hierarchical control approach that achieves stable underactuated bipedal walking on a horizontally oscillating DRS. The highest layer of our approach is a real-time motion planner that generates desired global behaviors (i.e., center of mass trajectories and footstep locations) by stabilizing a reduced-order robot model. One key novelty of this layer is the derivation of the reduced-order model by analytically extending the angular momentum based linear inverted pendulum (ALIP) model from stationary to horizontally moving surfaces. The other novelty is the development of a discrete-time foot-placement controller that exponentially stabilizes the hybrid, linear, time-varying ALIP. The middle layer translates the desired global behaviors into the robot’s full-body reference trajectories for all directly actuated degrees of freedom, while the lowest layer exponentially tracks those reference trajectories based on the full-order, hybrid, nonlinear robot model. Simulations confirm that the proposed framework ensures stable walking of a planar underactuated biped under different swaying DRS motions and gait types. 
    more » « less
  2. Linear policies are the simplest class of policies that can achieve stable bipedal walking behaviors in both simulation and hardware. However, a significant challenge in deploying them widely is the difficulty in extending them to more dynamic behaviors like hopping and running. Therefore, in this work, we propose a new class of linear policies in which template models can be embedded. In particular, we show how to embed Spring Loaded Inverted Pendulum (SLIP) model in the policy class and realize perpetual hopping in arbitrary directions. The spring constant of the template model is learned in addition to the remaining parameters of the policy. Given this spring constant, the goal is to realize hopping trajectories using the SLIP model, which are then tracked by the bipedal robot using the linear policy. Continuous hopping with adjustable heading direction was achieved across different terrains in simulation with heading and lateral velocities of up to O.5m/ sec and 0.05m/ sec, respectively. The policy was then transferred to the hardware, and preliminary results (> 10 steps) of hopping were achieved. 
    more » « less
  3. This work presents a hierarchical framework for bipedal locomotion that combines a Reinforcement Learning (RL)-based high-level (HL) planner policy for the online generation of task space commands with a model-based low-level (LL) controller to track the desired task space trajectories. Different from traditional end-to-end learning approaches, our HL policy takes insights from the angular momentum-based linear inverted pendulum (ALIP) to carefully design the observation and action spaces of the Markov Decision Process (MDP). This simple yet effective design creates an insightful mapping between a low-dimensional state that effectively captures the complex dynamics of bipedal locomotion and a set of task space outputs that shape the walking gait of the robot. The HL policy is agnostic to the task space LL controller, which increases the flexibility of the design and generalization of the framework to other bipedal robots. This hierarchical design results in a learning-based framework with improved performance, data efficiency, and robustness compared with the ALIP model-based approach and state-of-the-art learning-based frameworks for bipedal locomotion. The proposed hierarchical controller is tested in three different robots, Rabbit, a five-link underactuated planar biped; Walker2D, a seven-link fully-actuated planar biped; and Digit, a 3D humanoid robot with 20 actuated joints. The trained policy naturally learns human-like locomotion behaviors and is able to effectively track a wide range of walking speeds while preserving the robustness and stability of the walking gait even under adversarial conditions. 
    more » « less
  4. null (Ed.)
    Can we design motion primitives for complex legged systems uniformly for different terrain types without neglecting modeling details? This paper presents a method for rapidly generating quadrupedal locomotion on sloped terrains-from modeling to gait generation, to hardware demonstration. At the core of this approach is the observation that a quadrupedal robot can be exactly decomposed into coupled bipedal robots. Formally, this is represented through the framework of coupled control systems, wherein isolated subsystems interact through coupling constraints. We demonstrate this concept in the context of quadrupeds and use it to reduce the gait planning problem for uneven terrains to bipedal walking generation via hybrid zero dynamics. This reduction method allows for the formulation of a nonlinear optimization problem that leverages low-dimensional bipedal representations to generate dynamic walking gaits on slopes for the full-order quadrupedal robot dynamics. The result is the ability to rapidly generate quadrupedal walking gaits on a variety of slopes. We demonstrate these walking behaviors on the Vision 60 quadrupedal robot; in simulation, via walking on a range of sloped terrains of 13°, 15°, 20°, 25°, and, experimentally, through the successful locomotion of 13° and 20° ~ 25° sloped outdoor grasslands. 
    more » « less
  5. This study proposes a novel planning framework based on a model predictive control formulation that incorporates signal temporal logic (STL) specifications for task completion guarantees and robustness quantification. This marks the first-ever study to apply STL-guided trajectory optimization for bipedal locomotion push recovery, where the robot experiences unexpected disturbances. Existing recovery strategies often struggle with complex task logic reasoning and locomotion robustness evaluation, making them susceptible to failures due to inappropriate recovery strategies or insufficient robustness. To address this issue, the STL-guided framework generates optimal and safe recovery trajectories that simultaneously satisfy the task specification and maximize the locomotion robustness. Our framework outperforms a state-of-the-art locomotion controller in a high-fidelity dynamic simulation, especially in scenarios involving crossed-leg maneuvers. Furthermore, it demonstrates versatility in tasks such as locomotion on stepping stones, where the robot must select from a set of disjointed footholds to maneuver successfully. 
    more » « less