Data-Driven Latent Space Representation for Robust Bipedal Locomotion Learning
This paper presents a novel framework for learning robust bipedal walking by combining a data-driven state representation with a Reinforcement Learning (RL)-based locomotion policy. The framework uses an autoencoder to learn a low-dimensional latent space that captures the complex dynamics of bipedal locomotion from existing locomotion data. This reduced-dimensional representation then serves as the state for training a robust RL-based gait policy, eliminating the need for heuristic state selection or template models for gait planning. The results demonstrate that the learned latent variables are disentangled and correspond directly to different gaits or speeds, such as moving forward, moving backward, or walking in place. Compared to traditional template model-based approaches, our framework exhibits superior performance and robustness in simulation. The trained policy effectively tracks a wide range of walking speeds and generalizes well to unseen scenarios.
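As a rough illustration of the idea in the abstract, the sketch below trains a tiny autoencoder on synthetic data and exposes the encoder output as a low-dimensional state that a downstream RL policy could observe. It is not the authors' implementation: the network size, the tanh encoder, the plain gradient-descent training loop, the synthetic data, and every function name (`make_synthetic_data`, `train_autoencoder`, `encode`, `decode`) are assumptions made only for this example.

```python
import numpy as np


def make_synthetic_data(n=500, input_dim=10, seed=0):
    """Hypothetical stand-in for locomotion data: 10-D samples driven by 2 hidden factors."""
    rng = np.random.default_rng(seed)
    factors = rng.uniform(-1.0, 1.0, size=(n, 2))
    mixing = rng.standard_normal((2, input_dim))
    X = factors @ mixing + 0.01 * rng.standard_normal((n, input_dim))
    # Standardize each feature so that gradient descent is well conditioned.
    return (X - X.mean(axis=0)) / X.std(axis=0)


def encode(params, X):
    """Map full-order samples to the low-dimensional latent state."""
    return np.tanh(X @ params["W_e"] + params["b_e"])


def decode(params, Z):
    """Reconstruct full-order samples from latent states."""
    return Z @ params["W_d"] + params["b_d"]


def train_autoencoder(X, latent_dim=2, lr=0.01, epochs=1000, seed=0):
    """Fit the autoencoder by full-batch gradient descent on the reconstruction error."""
    rng = np.random.default_rng(seed)
    n, input_dim = X.shape
    params = {
        "W_e": 0.1 * rng.standard_normal((input_dim, latent_dim)),
        "b_e": np.zeros(latent_dim),
        "W_d": 0.1 * rng.standard_normal((latent_dim, input_dim)),
        "b_d": np.zeros(input_dim),
    }
    history = []
    for _ in range(epochs):
        Z = encode(params, X)
        err = decode(params, Z) - X
        # Loss: half the mean squared reconstruction error per sample.
        history.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))
        g = err / n
        # Backpropagate through the linear decoder and the tanh encoder.
        d_pre = (g @ params["W_d"].T) * (1.0 - Z ** 2)
        grads = {
            "W_e": X.T @ d_pre,
            "b_e": d_pre.sum(axis=0),
            "W_d": Z.T @ g,
            "b_d": g.sum(axis=0),
        }
        for name in params:
            params[name] -= lr * grads[name]
    return params, history


if __name__ == "__main__":
    X = make_synthetic_data()
    params, history = train_autoencoder(X)
    latent_state = encode(params, X[:1])  # what an RL policy would observe
    print("latent state shape:", latent_state.shape)
    print("reconstruction loss decreased:", history[-1] < history[0])
```

In a setup like the one the abstract describes, the encoder would presumably be trained on recorded locomotion data and then held fixed, with `encode` applied to each robot measurement before it is passed to the policy; the details of that pipeline are not given here.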
- Award ID(s): 2144156
- PAR ID: 10579189
- Publisher / Repository: IEEE
- Date Published:
- ISBN: 979-8-3503-8457-4
- Page Range / eLocation ID: 1172 to 1178
- Format(s): Medium: X
- Location: Yokohama, Japan
- Sponsoring Org: National Science Foundation
More Like this
- This work presents a hierarchical framework for bipedal locomotion that combines a Reinforcement Learning (RL)-based high-level (HL) planner policy for the online generation of task space commands with a model-based low-level (LL) controller to track the desired task space trajectories. Different from traditional end-to-end learning approaches, our HL policy takes insights from the angular momentum-based linear inverted pendulum (ALIP) to carefully design the observation and action spaces of the Markov Decision Process (MDP). This simple yet effective design creates an insightful mapping between a low-dimensional state that effectively captures the complex dynamics of bipedal locomotion and a set of task space outputs that shape the walking gait of the robot. The HL policy is agnostic to the task space LL controller, which increases the flexibility of the design and the generalization of the framework to other bipedal robots. This hierarchical design results in a learning-based framework with improved performance, data efficiency, and robustness compared with the ALIP model-based approach and state-of-the-art learning-based frameworks for bipedal locomotion. The proposed hierarchical controller is tested on three different robots: Rabbit, a five-link underactuated planar biped; Walker2D, a seven-link fully-actuated planar biped; and Digit, a 3D humanoid robot with 20 actuated joints. The trained policy naturally learns human-like locomotion behaviors and is able to effectively track a wide range of walking speeds while preserving the robustness and stability of the walking gait even under adversarial conditions.
- This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a novel dual-history architecture, utilizing both a long-term and short-term input/output (I/O) history of the robot. This control architecture, when trained through the proposed end-to-end RL approach, consistently outperforms other methods across a diverse range of skills in both simulation and the real world. The study also delves into the adaptivity and robustness introduced by the proposed RL system in developing locomotion controllers. We demonstrate that the proposed architecture can adapt to both time-invariant dynamics shifts and time-variant changes, such as contact events, by effectively using the robot’s I/O history. Additionally, we identify task randomization as another key source of robustness, fostering better task generalization and compliance to disturbances. The resulting control policies can be successfully deployed on Cassie, a torque-controlled human-sized bipedal robot. This work pushes the limits of agility for bipedal robots through extensive real-world experiments. We demonstrate a diverse range of locomotion skills, including robust standing, versatile walking, fast running with a demonstration of a 400-meter dash, and a diverse set of jumping skills, such as standing long jumps and high jumps.
- Can we design motion primitives for complex legged systems uniformly for different terrain types without neglecting modeling details? This paper presents a method for rapidly generating quadrupedal locomotion on sloped terrains, from modeling to gait generation to hardware demonstration. At the core of this approach is the observation that a quadrupedal robot can be exactly decomposed into coupled bipedal robots. Formally, this is represented through the framework of coupled control systems, wherein isolated subsystems interact through coupling constraints. We demonstrate this concept in the context of quadrupeds and use it to reduce the gait planning problem for uneven terrains to bipedal walking generation via hybrid zero dynamics. This reduction method allows for the formulation of a nonlinear optimization problem that leverages low-dimensional bipedal representations to generate dynamic walking gaits on slopes for the full-order quadrupedal robot dynamics. The result is the ability to rapidly generate quadrupedal walking gaits on a variety of slopes. We demonstrate these walking behaviors on the Vision 60 quadrupedal robot: in simulation, by walking on slopes of 13°, 15°, 20°, and 25°, and experimentally, through successful locomotion on outdoor grasslands with slopes of 13° and 20° to 25°.
- This paper systematically decomposes a quadrupedal robot into bipeds to rapidly generate walking gaits and then recomposes these gaits to obtain quadrupedal locomotion. We begin by decomposing the full-order, nonlinear and hybrid dynamics of a three-dimensional quadrupedal robot, including its continuous and discrete dynamics, into two bipedal systems that are subject to external forces. Using the hybrid zero dynamics (HZD) framework, gaits for these bipedal robots can be rapidly generated (on the order of seconds) along with corresponding controllers. The decomposition is achieved in such a way that the bipedal walking gaits and controllers can be composed to yield dynamic walking gaits for the original quadrupedal robot; the result is the rapid generation of dynamic quadruped gaits utilizing the full-order dynamics. This methodology is demonstrated through the rapid generation (3.96 seconds on average) of four stepping-in-place gaits and one diagonally symmetric ambling gait at 0.35 m/s on a quadrupedal robot (the Vision 60, with 36 state variables and 12 control inputs), both in simulation and through outdoor experiments. This suggests a new approach for fast quadrupedal trajectory planning using full-body dynamics, without the need for empirical model simplification, wherein methods from dynamic bipedal walking can be directly applied to quadrupeds.