Tensegrity robots, composed of rigid rods and flexible cables, are difficult to model and control accurately given their complex dynamics and high number of DoFs. Differentiable physics engines have recently been proposed as a data-driven approach for model identification of such complex robotic systems. These engines are often executed at a high frequency to achieve accurate simulation. Ground-truth trajectories for training differentiable engines, however, are not typically available at such high frequencies due to limitations of real-world sensors. The present work focuses on this frequency mismatch, which impacts the modeling accuracy. We propose a recurrent structure for a differentiable physics engine of tensegrity robots, which can be trained effectively even with low-frequency trajectories. To train this new recurrent engine robustly, this work introduces three components that are new relative to prior work: (i) a new implicit integration scheme, (ii) a progressive training pipeline, and (iii) a differentiable collision checker. A model of NASA's icosahedron SUPERballBot in MuJoCo is used as the ground-truth system to collect training data. Simulated experiments show that once the recurrent differentiable engine has been trained on the low-frequency trajectories from MuJoCo, it is able to match the behavior of MuJoCo's system. The criterion for success is whether a locomotion strategy learned using the differentiable engine can be transferred back to the ground-truth system and result in a similar motion. Notably, the amount of ground-truth data needed to train the differentiable engine, such that the policy is transferable to the ground-truth system, is 1% of the data needed to train the policy directly on the ground-truth system.
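The frequency mismatch described above can be illustrated with a minimal sketch. It is not the engine from the abstract: it assumes a hypothetical 1D spring-mass system with a single unknown stiffness `k`, a semi-implicit Euler integrator in place of the paper's implicit scheme, and a Gauss-Newton update in place of gradient-based training. The simulator takes many small steps, but the loss is evaluated only at the sparse observation times, with derivatives carried through every substep.

```python
def rollout(k, n_obs, substeps, dt=0.001, m=1.0, x0=1.0):
    """Unroll a spring-mass simulator (illustrative stand-in for an engine).

    Returns the position x and its sensitivity dx/dk at every
    `substeps`-th step, i.e. only at the low-frequency observation times.
    """
    x, v = x0, 0.0
    dx, dv = 0.0, 0.0  # forward-mode derivatives of x and v w.r.t. k
    xs, dxs = [], []
    for _ in range(n_obs):
        for _ in range(substeps):  # high-frequency inner recurrence
            # Differentiate the velocity update before x and v change.
            dv = dv - (x + k * dx) / m * dt
            v = v - k * x / m * dt
            # Position update uses the new velocity (semi-implicit Euler).
            dx = dx + dv * dt
            x = x + v * dt
        xs.append(x)
        dxs.append(dx)
    return xs, dxs


def fit_stiffness(observed, substeps, k_init=5.0, iters=10):
    """Fit k from low-frequency observations with Gauss-Newton steps."""
    k = k_init
    for _ in range(iters):
        xs, dxs = rollout(k, len(observed), substeps)
        residuals = [x - y for x, y in zip(xs, observed)]
        grad = sum(r * d for r, d in zip(residuals, dxs))
        curv = sum(d * d for d in dxs)
        if curv == 0.0:
            break
        k -= grad / curv
    return k


# Synthetic "ground truth": 1 kHz simulation observed at 20 Hz.
observed, _ = rollout(10.0, n_obs=10, substeps=50)
k_hat = fit_stiffness(observed, substeps=50)
print(k_hat)
```

Here the simulator runs 50 substeps between consecutive observations, so only one state in fifty contributes to the loss. The horizon is kept shorter than one oscillation period, an assumption made so that this toy fit stays well conditioned.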
Fast Model Identification via Physics Engines for Data-Efficient Policy Search
This paper presents a practical approach for identifying unknown mechanical parameters, such as mass and friction models of manipulated rigid objects or actuated robotic links, in a succinct manner that aims to improve the performance of policy search algorithms. Key features of this approach are the use of off-the-shelf physics engines and the adaptation of a black-box Bayesian optimization framework for this purpose. The physics engine is used to reproduce in simulation experiments that are performed on a real robot, and the mechanical parameters of the simulated system are automatically fine-tuned so that the simulated trajectories match the real ones. The optimized model is then used for learning a policy in simulation, before safely deploying it on the real robot. Given the well-known limitations of physics engines in modeling real-world objects, it is generally not possible to find a mechanical model that exactly reproduces the real trajectories in simulation. Moreover, there are many scenarios where a near-optimal policy can be found without perfect knowledge of the system. Therefore, searching for a perfect model may not be worth the computational effort in practice. The proposed approach therefore aims to identify a model that is good enough to approximate the value of a locally optimal policy with a certain confidence, instead of spending all the computational resources on searching for the most accurate model. Empirical evaluations, performed in simulation and on a real robotic manipulation task, show that model identification via physics engines can significantly boost the performance of policy search algorithms that are popular in robotics, such as TRPO, PoWER and PILCO, with no additional real-world data.
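The trajectory-matching idea can be sketched in a few lines. This is an illustrative toy, not the paper's method: it assumes a hypothetical sliding block with one unknown friction coefficient, a hand-written simulator in place of an off-the-shelf physics engine, and seeded random search in place of Bayesian optimization. The search stops as soon as the model is good enough (the error falls below `tol`) rather than chasing the most accurate parameter.

```python
import random


def simulate(mu, v0=2.0, dt=0.01, steps=100, g=9.81):
    """Positions of a block sliding with Coulomb friction coefficient mu."""
    x, v, xs = 0.0, v0, []
    for _ in range(steps):
        # Friction decelerates the block until it comes to rest.
        v = max(0.0, v - mu * g * dt)
        x += v * dt
        xs.append(x)
    return xs


def trajectory_error(mu, observed):
    """Mean squared difference between simulated and observed positions."""
    sim = simulate(mu, steps=len(observed))
    return sum((a - b) ** 2 for a, b in zip(sim, observed)) / len(observed)


def identify(observed, low=0.0, high=1.0, budget=200, tol=1e-4, seed=0):
    """Black-box search for mu under a fixed simulation budget.

    Random search is a simple stand-in for Bayesian optimization here.
    """
    rng = random.Random(seed)
    best_mu, best_err = None, float("inf")
    for _ in range(budget):
        mu = rng.uniform(low, high)
        err = trajectory_error(mu, observed)
        if err < best_err:
            best_mu, best_err = mu, err
        if best_err < tol:  # good enough: stop spending compute
            break
    return best_mu, best_err


# Stand-in for a trajectory recorded on the real robot (true mu = 0.3).
observed = simulate(0.3)
mu_hat, err = identify(observed)
print(mu_hat, err)
```

In this sketch the identified model would then be handed to a policy search routine run purely in simulation; that step is omitted because it depends on the chosen algorithm.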
- PAR ID: 10086344
- Date Published:
- Journal Name: International Joint Conference on Artificial Intelligence (IJCAI)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Weitzenfeld, A (Ed.) Studies of the group predatory behavior of wolves have inspired multiple robotic architectures that mimic these biological behaviors. In this work, we aim to use robotic systems to mimic the single and group behavior of wolf packs. This work extends the original research by Weitzenfeld et al. [7] and evaluates it under a new multi-robot system architecture. The architecture includes a 'Prey' pursued by a wolf pack consisting of an 'Alpha' and a 'Beta' robotic group. The 'Alpha' wolf is the group leader, searching for and tracking the 'Prey'. At the same time, the multiple 'Beta' wolves follow behind the Alpha, tracking it and maintaining a set distance in the formation. The robotic systems used are multiple Raspberry Pi robots designed in the USF bio-robotics lab that use a combination of color cameras and distance sensors to help the 'Beta' wolves keep a set distance between the 'Alpha' wolf and themselves. Several experiments were performed in simulation, using Webots, and with physical robots. An analysis compared the performance of the physical robot in the real world to that of the virtual robot in the simulated environment.
-
This paper investigates data-efficient methods for learning robust control policies. Reinforcement learning has emerged as an effective approach to learning control policies by interacting directly with the plant, but it requires a significant number of example trajectories to converge to the optimal policy. Combining model-free reinforcement learning with model-based control methods achieves better data efficiency via simultaneous system identification and controller synthesis. We study a novel approach that exploits the existence of approximate physics models to accelerate the learning of control policies. The proposed approach consists of iterating through three key steps: evaluating a selected policy on the real-world plant and recording trajectories, building a Gaussian process model to predict the reality gap of a parametric physics model in the neighborhood of the selected policy, and synthesizing a new policy using reinforcement learning on the refined physics model that most likely approximates the real plant. The approach converges to an optimal policy as well as an approximate physics model. The real-world experiments are limited to evaluating only promising candidate policies, and the use of Gaussian processes minimizes the number of required real-world trajectories. We demonstrate the effectiveness of our techniques on a set of simulation case studies using OpenAI Gym environments.
-
In recent years, the increasing complexity and safety-critical nature of robotic tasks have highlighted the importance of accurate and reliable robot models. This trend has led to a growing belief that, given enough data, traditional physics-based robot models can be replaced by appropriately trained deep networks or their variants. Simultaneously, there has been a renewed interest in physics-based simulation, fueled by the widespread use of simulators to train reinforcement learning algorithms in the sim-to-real paradigm. The primary objective of this review is to present a unified perspective on the process of determining robot models from data, commonly known as system identification or model learning in different subfields. The review aims to illuminate the key challenges encountered and highlight recent advancements in system identification for robotics. Specifically, we focus on recent breakthroughs that leverage the geometry of the identification problem and incorporate physics-based knowledge beyond mere first-principles model parameterizations. Through these efforts, we strive to provide a contemporary outlook on this problem, bridging classical findings with the latest progress in the field.
-
This paper proposes an approach to domain transfer based on a pairwise loss function that helps transfer control policies learned in simulation onto a real robot. We explore the idea in the context of a “category level” manipulation task where a control policy is learned that enables a robot to perform a mating task involving novel objects. We explore the case where depth images are used as the main form of sensor input. Our experimental results demonstrate that the proposed method consistently outperforms baseline methods that train only in simulation or that combine real and simulated data in a naive way.

