skip to main content


Title: Augmenting Simulation Data with Sensor Effects for Improved Domain Transfer
Simulation provides vast benefits for the field of robotics and Human-Robot Interaction (HRI). This study investigates how sensor effects seen in the real domain can be modeled in simulation and what role they play in effective Sim2Real domain transfer for learned perception models. The study considers introducing naive noise approaches such as additive Gaussian and salt and pepper noise as well as data-driven sensor effects models into simulation for representing Microsoft Kinect sensor capabilities and phenomena seen on real world systems. This study quantifies the benefit of multiple approaches to modeling sensor effects in simulation for Sim2Real domain transfer by their object classification improvements in the real domain. User studies are conducted to address hypotheses by training grounded language models in each of the sensor effects modeling cases and evaluated on the robot's interaction capabilities in the real domain. In addition to grounded language performance metrics, user study evaluation includes surveys on the human participant's assessment of the robot's capabilities in the real domain. Results from this pilot study show benefits to modeling sensor noise in simulation for Sim2Real domain transfer. This study also begins to explore the effects that such models have on human-robot interactions.  more » « less
Award ID(s):
2024878 1813223
NSF-PAR ID:
10382789
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Tenth International Workshop on Assistive Computer Vision and Robotics (ACVR) at ECCV
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Gonzalez, D. (Ed.)

    Today’s research on human-robot teaming requires the ability to test artificial intelligence (AI) algorithms for perception and decision-making in complex real-world environments. Field experiments, also referred to as experiments “in the wild,” do not provide the level of detailed ground truth necessary for thorough performance comparisons and validation. Experiments on pre-recorded real-world data sets are also significantly limited in their usefulness because they do not allow researchers to test the effectiveness of active robot perception and control or decision strategies in the loop. Additionally, research on large human-robot teams requires tests and experiments that are too costly even for the industry and may result in considerable time losses when experiments go awry. The novel Real-Time Human Autonomous Systems Collaborations (RealTHASC) facility at Cornell University interfaces real and virtual robots and humans with photorealistic simulated environments by implementing new concepts for the seamless integration of wearable sensors, motion capture, physics-based simulations, robot hardware and virtual reality (VR). The result is an extended reality (XR) testbed by which real robots and humans in the laboratory are able to experience virtual worlds, inclusive of virtual agents, through real-time visual feedback and interaction. VR body tracking by DeepMotion is employed in conjunction with the OptiTrack motion capture system to transfer every human subject and robot in the real physical laboratory space into a synthetic virtual environment, thereby constructing corresponding human/robot avatars that not only mimic the behaviors of the real agents but also experience the virtual world through virtual sensors and transmit the sensor data back to the real human/robot agent, all in real time. New cross-domain synthetic environments are created in RealTHASC using Unreal Engine™, bridging the simulation-to-reality gap and allowing for the inclusion of underwater/ground/aerial autonomous vehicles, each equipped with a multi-modal sensor suite. The experimental capabilities offered by RealTHASC are demonstrated through three case studies showcasing mixed real/virtual human/robot interactions in diverse domains, leveraging and complementing the benefits of experimentation in simulation and in the real world.

     
    more » « less
  2. null (Ed.)
    Learning policies in simulation is promising for reducing human effort when training robot controllers. This is especially true for soft robots that are more adaptive and safe but also more difficult to accurately model and control. The sim2real gap is the main barrier to successfully transfer policies from simulation to a real robot. System identification can be applied to reduce this gap but traditional identification methods require a lot of manual tuning. Data-driven alternatives can tune dynamical models directly from data but are often data hungry, which also incorporates human effort in collecting data. This work proposes a data-driven, end-to-end differentiable simulator focused on the exciting but challenging domain of tensegrity robots. To the best of the authors’ knowledge, this is the first differentiable physics engine for tensegrity robots that supports cable, contact, and actuation modeling. The aim is to develop a reasonably simplified, data-driven simulation, which can learn approximate dynamics with limited ground truth data. The dynamics must be accurate enough to generate policies that can be transferred back to the ground-truth system. As a first step in this direction, the current work demonstrates sim2sim transfer, where the unknown physical model of MuJoCo acts as a ground truth system. Two different tensegrity robots are used for evaluation and learning of locomotion policies, a 6-bar and a 3-bar tensegrity. The results indicate that only 0.25% of ground truth data are needed to train a policy that works on the ground truth system when the differentiable engine is used for training against training the policy directly on the ground truth system. 
    more » « less
  3. We consider the problem of multi-robot sensor coverage, which deals with deploying a multi-robot team in an environment and optimizing the sensing quality of the overall environment. As real-world environments involve a variety of sensory information, and individual robots are limited in their available number of sensors, successful multi-robot sensor coverage requires the deployment of robots in such a way that each individual team member’s sensing quality is maximized. Additionally, because individual robots have varying complements of sensors and both robots and sensors can fail, robots must be able to adapt and adjust how they value each sensing capability in order to obtain the most complete view of the environment, even through changes in team composition. We introduce a novel formulation for sensor coverage by multi-robot teams with heterogeneous sensing capabilities that maximizes each robot's sensing quality, balancing the varying sensing capabilities of individual robots based on the overall team composition. We propose a solution based on regularized optimization that uses sparsity-inducing terms to ensure a robot team focuses on all possible event types, and which we show is proven to converge to the optimal solution. Through extensive simulation, we show that our approach is able to effectively deploy a multi-robot team to maximize the sensing quality of an environment, responding to failures in the multi-robot team more robustly than non-adaptive approaches. 
    more » « less
  4. Tan, Jie ; Toussaint, Marc (Ed.)
    With the advent of large language models and large-scale robotic datasets, there has been tremendous progress in high-level decision-making for object manipulation [1, 2, 3, 4]. These generic models are able to interpret complex tasks using language commands, but they often have difficulties generalizing to out-of-distribution objects due to the inability of low-level action primitives. In contrast, existing task-specific models [5, 6] excel in low-level manipulation of unknown objects, but only work for a single type of action. To bridge this gap, we present M2T2, a single model that supplies different types of low-level actions that work robustly on arbitrary objects in cluttered scenes. M2T2 is a transformer model which reasons about contact points and predicts valid gripper poses for different action modes given a raw point cloud of the scene. Trained on a large-scale synthetic dataset with 128K scenes, M2T2 achieves zero-shot sim2real transfer on the real robot, outperforming the baseline system with state- of-the-art task-specific models by about 19% in overall performance and 37.5% in challenging scenes where the object needs to be re-oriented for collision- free placement. M2T2 also achieves state-of-the-art results on a subset of language conditioned tasks in RLBench [7]. Videos of robot experiments on unseen objects in both real world and simulation are available on our project website https://m2-t2.github.io. 
    more » « less
  5. null (Ed.)
    Across a wide variety of domains, artificial agents that can adapt and personalize to users have potential to improve and transform how social services are provided. Because of the need for personalized interaction data to drive this process, long-term (or longitudinal) interactions between users and agents, which unfold over a series of distinct interaction sessions, have attracted substantial research interest. In recognition of the expanded scope and structure of a long-term interaction, researchers are also adjusting the personalization models and algorithms used, orienting toward “continual learning” methods, which do not assume a stationary modeling target and explicitly account for the temporal context of training data. In parallel, researchers have also studied the effect of “multitask personalization,” an approach in which an agent interacts with users over multiple different tasks contexts throughout the course of a long-term interaction and learns personalized models of a user that are transferrable across these tasks. In this paper, we unite these two paradigms under the framework of “Lifelong Personalization,” analyzing the effect of multitask personalization applied to dynamic, non-stationary targets. We extend the multi-task personalization approach to the more complex and realistic scenario of modeling dynamic learners over time, focusing in particular on interactive scenarios in which the modeling agent plays an active role in teaching the student whose knowledge the agent is simultaneously attempting to model. Inspired by the way in which agents use active learning to select new training data based on domain context, we augment a Gaussian Process-based multitask personalization model with a mechanism to actively and continually manage its own training data, allowing a modeling agent to remove or reduce the weight of observed data from its training set, based on interactive context cues. We evaluate this method in a series of simulation experiments comparing different approaches to continual and multitask learning on simulated student data. We expect this method to substantially improve learning in Gaussian Process models in dynamic domains, establishing Gaussian Processes as another flexible modeling tool for Long-term Human-Robot Interaction (HRI) Studies. 
    more » « less