skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Data-Driven Latent Space Representation for Robust Bipedal Locomotion Learning
This paper presents a novel framework for learning robust bipedal walking by combining a data-driven state representation with a Reinforcement Learning (RL) based locomotion policy. The framework utilizes an autoencoder to learn a low-dimensional latent space that captures the complex dynamics of bipedal locomotion from existing locomotion data. This reduced dimensional state representation is then used as states for training a robust RL-based gait policy, eliminating the need for heuristic state selections or the use of template models for gait planning. The results demonstrate that the learned latent variables are disentangled and directly correspond to different gaits or speeds, such as moving forward, backward, or walking in place. Compared to traditional template model-based approaches, our framework exhibits superior performance and robustness in simulation. The trained policy effectively tracks a wide range of walking speeds and demonstrates good generalization capabilities to unseen scenarios.  more » « less
Award ID(s):
2144156
PAR ID:
10496141
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
Journal Name:
IEEE International Conference on Robotics and Automation
ISSN:
1049-3492
Format(s):
Medium: X
Location:
Yokohama, Japan
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents a novel framework for learning robust bipedal walking by combining a data-driven state representation with a Reinforcement Learning (RL) based locomotion policy. The framework utilizes an autoencoder to learn a low-dimensional latent space that captures the complex dynamics of bipedal locomotion from existing locomotion data. This reduced dimensional state representation is then used as states for training a robust RL-based gait policy, eliminating the need for heuristic state selections or the use of template models for gait planning. The results demonstrate that the learned latent variables are disentangled and directly correspond to different gaits or speeds, such as moving forward, backward, or walking in place. Compared to traditional template model-based approaches, our framework exhibits superior performance and robustness in simulation. The trained policy effectively tracks a wide range of walking speeds and demonstrates good generalization capabilities to unseen scenarios. 
    more » « less
  2. This work presents a hierarchical framework for bipedal locomotion that combines a Reinforcement Learning (RL)-based high-level (HL) planner policy for the online generation of task space commands with a model-based low-level (LL) controller to track the desired task space trajectories. Different from traditional end-to-end learning approaches, our HL policy takes insights from the angular momentum-based linear inverted pendulum (ALIP) to carefully design the observation and action spaces of the Markov Decision Process (MDP). This simple yet effective design creates an insightful mapping between a low-dimensional state that effectively captures the complex dynamics of bipedal locomotion and a set of task space outputs that shape the walking gait of the robot. The HL policy is agnostic to the task space LL controller, which increases the flexibility of the design and generalization of the framework to other bipedal robots. This hierarchical design results in a learning-based framework with improved performance, data efficiency, and robustness compared with the ALIP model-based approach and state-of-the-art learning-based frameworks for bipedal locomotion. The proposed hierarchical controller is tested in three different robots, Rabbit, a five-link underactuated planar biped; Walker2D, a seven-link fully-actuated planar biped; and Digit, a 3D humanoid robot with 20 actuated joints. The trained policy naturally learns human-like locomotion behaviors and is able to effectively track a wide range of walking speeds while preserving the robustness and stability of the walking gait even under adversarial conditions. 
    more » « less
  3. This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a novel dual-history architecture, utilizing both a long-term and short-term input/output (I/O) history of the robot. This control architecture, when trained through the proposed end-to-end RL approach, consistently outperforms other methods across a diverse range of skills in both simulation and the real world. The study also delves into the adaptivity and robustness introduced by the proposed RL system in developing locomotion controllers. We demonstrate that the proposed architecture can adapt to both time-invariant dynamics shifts and time-variant changes, such as contact events, by effectively using the robot’s I/O history. Additionally, we identify task randomization as another key source of robustness, fostering better task generalization and compliance to disturbances. The resulting control policies can be successfully deployed on Cassie, a torque-controlled human-sized bipedal robot. This work pushes the limits of agility for bipedal robots through extensive real-world experiments. We demonstrate a diverse range of locomotion skills, including: robust standing, versatile walking, fast running with a demonstration of a 400-meter dash, and a diverse set of jumping skills, such as standing long jumps and high jumps. 
    more » « less
  4. null (Ed.)
    Can we design motion primitives for complex legged systems uniformly for different terrain types without neglecting modeling details? This paper presents a method for rapidly generating quadrupedal locomotion on sloped terrains-from modeling to gait generation, to hardware demonstration. At the core of this approach is the observation that a quadrupedal robot can be exactly decomposed into coupled bipedal robots. Formally, this is represented through the framework of coupled control systems, wherein isolated subsystems interact through coupling constraints. We demonstrate this concept in the context of quadrupeds and use it to reduce the gait planning problem for uneven terrains to bipedal walking generation via hybrid zero dynamics. This reduction method allows for the formulation of a nonlinear optimization problem that leverages low-dimensional bipedal representations to generate dynamic walking gaits on slopes for the full-order quadrupedal robot dynamics. The result is the ability to rapidly generate quadrupedal walking gaits on a variety of slopes. We demonstrate these walking behaviors on the Vision 60 quadrupedal robot; in simulation, via walking on a range of sloped terrains of 13°, 15°, 20°, 25°, and, experimentally, through the successful locomotion of 13° and 20° ~ 25° sloped outdoor grasslands. 
    more » « less
  5. Template models, such as the Bipedal Spring-Loaded Inverted Pendulum and the Virtual Pivot Point, have been widely used as low-dimensional representations of the complex dynamics in legged locomotion. Despite their ability to qualitatively match human walking characteristics like M-shaped ground reaction force (GRF) profiles, they often exhibit discrepancies when compared to experimental data, notably in overestimating vertical center of mass (CoM) displacement and underestimating gait event timings (touchdown/ liftoff). This paper hypothesizes that the constant leg stiffness of these models explains the majority of these discrepancies. The study systematically investigates the impact of stiffness variations on the fidelity of model fittings to human data, where an optimization framework is employed to identify optimal leg stiffness trajectories. The study also quantifies the effects of stiffness variations on salient characteristics of human walking (GRF profiles and gait event timing). The optimization framework was applied to 24 subjects walking at 40% to 145% preferred walking speed (PWS). The findings reveal that despite only modifying ground forces in one direction, variable leg stiffness models exhibited a >80% reduction in CoM error across both the B-SLIP and VPP models, while also improving prediction of human GRF profiles. However, the accuracy of gait event timing did not consistently show improvement across all conditions. The resulting stiffness profiles mimic walking characteristics of ankle push-off during double support and reduced CoM vaulting during single support. 
    more » « less