skip to main content


Title: Feedback motion planning of legged robots by composing orbital Lyapunov functions using rapidly-exploring random trees
We present a sampling-based framework for feed- back motion planning of legged robots. Our framework is based on switching between limit cycles at a fixed instance of motion, the Poincare ́section(e.g.,apex or touchdown),by finding overlaps between the regions of attraction (ROA) of two limit cycles. First, we assume a candidate orbital Lyapunov function (OLF) and define a ROA at the Poincare ́ section. Next, we solve multiple trajectory optimization problems, one for each sampled initial condition on the ROA to minimize an energy metric and subject to the exponential convergence of the OLF between two steps. The result is a table of control actions and the corresponding initial conditions at the Poincare ́ section. Then we develop a control policy for each control action as a function of the initial condition using deep learning neural networks. The control policy is validated by testing on initial conditions sampled on ROA of randomly chosen limit cycles. Finally, the rapidly-exploring random tree algorithm is adopted to plan transitions between the limit cycles using the ROAs. The approach is demonstrated on a hopper model to achieve velocity and height transitions between steps.  more » « less
Award ID(s):
1816925
NSF-PAR ID:
10104234
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
IEEE Conference on Robotics and Automation
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The popular approach of assuming a control policy and then finding the largest region of attraction (ROA) (e.g., sum-of-squares optimization) may lead to conservative estimates of the ROA, especially for highly nonlinear systems. We present a sampling-based approach that starts by assuming a ROA and then fi nds the necessary control policy by performing trajectory optimization on sampled initial conditions. Our method works with black-box models, produces a relatively large ROA, and ensures exponential convergence of the initial conditions to the periodic motion. We demonstrate the approach on a model of hopping and include extensive verification and robustness checks. 
    more » « less
  2. Background: Multivariate pattern analysis (MVPA or pattern decoding) has attracted considerable attention as a sensitive analytic tool for investigations using functional magnetic resonance imaging (fMRI) data. With the introduction of MVPA, however, has come a proliferation of methodological choices confronting the researcher, with few studies to date offering guidance from the vantage point of controlled datasets detached from specific experimental hypotheses. New method: We investigated the impact of four data processing steps on support vector machine (SVM) classification performance aimed at maximizing information capture in the presence of common noise sources. The four techniques included: trial averaging (classifying on separate trial estimates versus condition-based averages), within-run mean centering (centering the data or not), method of cost selection (using a fixed or tuned cost value), and motion-related denoising approach (comparing no denoising versus a variety of nuisance regressions capturing motion-related reference signals). The impact of these approaches was evaluated on real fMRI data from two control ROIs, as well as on simulated pattern data constructed with carefully controlled voxel- and trial-level noise components. Results: We find significant improvements in classification performance across both real and simulated datasets with run-wise trial averaging and mean centering. When averaging trials within conditions of each run, we note a simultaneous increase in the between-subject variability of SVM classification accuracies which we attribute to the reduced size of the test set used to assess the classifier's prediction error. Therefore, we propose a hybrid technique whereby randomly sampled subsets of trials are averaged per run and demonstrate that it helps mitigate the tradeoff between improving signal-to-noise ratio by averaging and losing exemplars in the test set. Comparison with existing methods: Though a handful of empirical studies have employed run-based trial averaging, mean centering, or their combination, such studies have done so without theoretical justification or rigorous testing using control ROIs. Conclusions: Therefore, we intend this study to serve as a practical guide for researchers wishing to optimize pattern decoding without risk of introducing spurious results. 
    more » « less
  3. Applying reinforcement learning (RL) to sparse reward domains is notoriously challenging due to insufficient guiding signals. Common RL techniques for addressing such domains include (1) learning from demonstrations and (2) curriculum learning. While these two approaches have been studied in detail, they have rarely been considered together. This paper aims to do so by introducing a principled task-phasing approach that uses demonstrations to automatically generate a curriculum sequence. Using inverse RL from (suboptimal) demonstrations we define a simple initial task. Our task phasing approach then provides a framework to gradually increase the complexity of the task all the way to the target task, while retuning the RL agent in each phasing iteration. Two approaches for phasing are considered: (1) gradually increasing the proportion of time steps an RL agent is in control, and (2) phasing out a guiding informative reward function. We present conditions that guarantee the convergence of these approaches to an optimal policy. Experimental results on 3 sparse reward domains demonstrate that our task-phasing approaches outperform state-of-the-art approaches with respect to asymptotic performance.

     
    more » « less
  4. null (Ed.)
    Using sampling to estimate the connectivity of high-dimensional configuration spaces has been the theoretical underpinning for effective sampling-based motion planners. Typical strategies either build a roadmap, or a tree as the underlying search structure that connects sampled configurations, with a focus on guaranteeing completeness and optimality as the number of samples tends to infinity. Roadmap-based planners allow preprocessing the space, and can solve multiple kinematic motion planning problems, but need a steering function to connect pairwise-states. Such steering functions are difficult to define for kinodynamic systems, and limit the applicability of roadmaps to motion planning problems with dynamical systems. Recent advances in the analysis of single query tree-based planners has shown that forward search trees based on random propagations are asymptotically optimal. The current work leverages these recent results and proposes a multi-query framework for kinodynamic planning. Bundles of kinodynamic edges can be sampled to cover the state space before the query arrives. Then, given a motion planning query, the connectivity of the state space reachable from the start can be recovered from a forward search tree reasoning about a local neighborhood of the edge bundle from each tree node. The work demonstrates theoretically that considering any constant radial neighborhood during this process is sufficient to guarantee asymptotic optimality. Experimental validation in five and twelve dimensional simulated systems also highlights the ability of the proposed edge bundles to express high-quality kinodynamic solutions. Our approach consistently finds higher quality solutions compared to SST, and RRT, often with faster initial solution times. The strategy of sampling kinodynamic edges is demonstrated to be a promising new paradigm. 
    more » « less
  5. We propose a locomotion framework for bipedal robots consisting of a new motion planning method, dubbed trajectory optimization for walking robots plus (TOWR+), and a new whole-body control method, dubbed implicit hierarchical whole-body controller (IHWBC). For versatility, we consider the use of a composite rigid body (CRB) model to optimize the robot’s walking behavior. The proposed CRB model considers the floating base dynamics while accounting for the effects of the heavy distal mass of humanoids using a pre-trained centroidal inertia network. TOWR+ leverages the phase-based parameterization of its precursor, TOWR, and optimizes for base and end-effectors motions, feet contact wrenches, as well as contact timing and locations without the need to solve a complementary problem or integer program. The use of IHWBC enforces unilateral contact constraints (i.e., non-slip and non-penetration constraints) and a task hierarchy through the cost function, relaxing contact constraints and providing an implicit hierarchy between tasks. This controller provides additional flexibility and smooth task and contact transitions as applied to our 10 degree-of-freedom, line-feet biped robot DRACO. In addition, we introduce a new open-source and light-weight software architecture, dubbed planning and control (PnC), that implements and combines TOWR+ and IHWBC. PnC provides modularity, versatility, and scalability so that the provided modules can be interchanged with other motion planners and whole-body controllers and tested in an end-to-end manner. In the experimental section, we first analyze the performance of TOWR+ using various bipeds. We then demonstrate balancing behaviors on the DRACO hardware using the proposed IHWBC method. Finally, we integrate TOWR+ and IHWBC and demonstrate step-and-stop behaviors on the DRACO hardware. 
    more » « less