


Title: RealTHASC—a cyber-physical XR testbed for AI-supported real-time human autonomous systems collaborations

Today’s research on human-robot teaming requires the ability to test artificial intelligence (AI) algorithms for perception and decision-making in complex real-world environments. Field experiments, also referred to as experiments “in the wild,” do not provide the level of detailed ground truth necessary for thorough performance comparisons and validation. Experiments on pre-recorded real-world data sets are also significantly limited in their usefulness because they do not allow researchers to test the effectiveness of active robot perception and control or decision strategies in the loop. Additionally, research on large human-robot teams requires tests and experiments that are too costly even for the industry and may result in considerable time losses when experiments go awry. The novel Real-Time Human Autonomous Systems Collaborations (RealTHASC) facility at Cornell University interfaces real and virtual robots and humans with photorealistic simulated environments by implementing new concepts for the seamless integration of wearable sensors, motion capture, physics-based simulations, robot hardware and virtual reality (VR). The result is an extended reality (XR) testbed by which real robots and humans in the laboratory are able to experience virtual worlds, inclusive of virtual agents, through real-time visual feedback and interaction. VR body tracking by DeepMotion is employed in conjunction with the OptiTrack motion capture system to transfer every human subject and robot in the real physical laboratory space into a synthetic virtual environment, thereby constructing corresponding human/robot avatars that not only mimic the behaviors of the real agents but also experience the virtual world through virtual sensors and transmit the sensor data back to the real human/robot agent, all in real time. New cross-domain synthetic environments are created in RealTHASC using Unreal Engine™, bridging the simulation-to-reality gap and allowing for the inclusion of underwater/ground/aerial autonomous vehicles, each equipped with a multi-modal sensor suite. The experimental capabilities offered by RealTHASC are demonstrated through three case studies showcasing mixed real/virtual human/robot interactions in diverse domains, leveraging and complementing the benefits of experimentation in simulation and in the real world.
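As a rough illustration of the real-to-virtual loop described in the abstract, the Python sketch below forwards tracked poses of real agents to simulated avatars and streams virtual sensor readings back to the corresponding real agent at a fixed rate. Every class and reading here is a hypothetical placeholder standing in for the motion-capture, simulation, and agent interfaces; this is not the RealTHASC, OptiTrack, DeepMotion, or Unreal Engine API.

```python
"""Minimal sketch of a real-to-virtual streaming loop (illustrative only)."""
from dataclasses import dataclass
import random
import time


@dataclass
class Pose:
    x: float
    y: float
    z: float
    yaw: float


class MoCapClient:
    """Placeholder for a motion-capture feed of tracked humans/robots."""

    def latest_pose(self, agent_id: str) -> Pose:
        # A real system would return tracked data; here it is random.
        return Pose(random.uniform(-5, 5), random.uniform(-5, 5), 0.0, 0.0)


class SimBridge:
    """Placeholder bridge to a photorealistic simulated environment."""

    def update_avatar(self, agent_id: str, pose: Pose) -> None:
        print(f"avatar {agent_id} moved to ({pose.x:.2f}, {pose.y:.2f})")

    def virtual_sensor_frame(self, agent_id: str) -> dict:
        # Stand-in for rendered camera/range data sensed in the virtual world.
        return {"agent": agent_id, "range_to_nearest_obstacle": random.uniform(0.5, 10.0)}


def streaming_loop(agent_ids, rate_hz=30.0, steps=5):
    mocap, sim = MoCapClient(), SimBridge()
    for _ in range(steps):
        for aid in agent_ids:
            sim.update_avatar(aid, mocap.latest_pose(aid))  # real -> virtual
            frame = sim.virtual_sensor_frame(aid)            # virtual -> real
            print(f"feedback to {aid}: {frame}")
        time.sleep(1.0 / rate_hz)


if __name__ == "__main__":
    streaming_loop(["human_1", "ugv_1"])
```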

 
Award ID(s):
2223811
NSF-PAR ID:
10478460
Author(s) / Creator(s):
Editor(s):
Gonzalez, D.
Publisher / Repository:
Frontiers
Date Published:
Journal Name:
Frontiers in Virtual Reality
Edition / Version:
1
Volume:
4
ISSN:
2673-4192
Page Range / eLocation ID:
1210211
Subject(s) / Keyword(s):
["robotics","virtual reality","human-autonomy teams","simulation systems","human-robot interaction","multi-robot communication","simulation-to-reality gap","artificial intelligence"]
Format(s):
Medium: X Size: 5309KB Other: pdf
Size(s):
["5309KB"]
Sponsoring Org:
National Science Foundation
More Like this
  1.
    This paper addresses the problem of autonomously deploying an unmanned aerial vehicle in non-trivial settings by leveraging a manipulator arm mounted on a ground robot, which acts as a versatile mobile launch platform. Real-world deployment scenarios for micro aerial vehicles, such as search-and-rescue operations, often entail exploration and navigation of challenging environments with uneven terrain, cluttered spaces, or even constrained openings and passageways, so a frequently arising problem is that of ensuring a safe take-off location, or of safely fitting through narrow openings while in flight. By launching from the manipulator end-effector, a 6-DoF controllable take-off pose within the arm workspace can be achieved, which makes it possible to properly position and orient the aerial vehicle to initialize the autonomous flight portion of a mission. To accomplish this, we propose a sampling-based planner that respects a) the kinematic constraints of the ground robot / manipulator / aerial robot combination, b) the geometry of the environment as autonomously mapped by the ground robot's perception systems, and c) the aerial robot's expected dynamic motion during take-off. The goal of the proposed planner is to ensure autonomous, collision-free initialization of an aerial robotic exploration mission, even within a cluttered and constrained environment. At the same time, the ground robot with the mounted manipulator can be used to position the take-off workspace over areas of interest, effectively acting as a carrier launch platform. We experimentally demonstrate this novel robotic capability through a sequence of experiments in which a micro aerial vehicle is carried and launched from a 6-DoF manipulator arm mounted on a four-wheel robot base.
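    A minimal, illustrative sketch of the kind of sampling-based feasibility check described in the abstract above: candidate end-effector launch positions are drawn from an assumed arm workspace and rejected if they collide with a toy obstacle map or leave too little clearance for the climb-out. The workspace bounds, obstacle model, vehicle radius, and clearance margin are invented for illustration and are not the authors' planner.

```python
"""Toy rejection-sampling sketch of a take-off pose planner (illustrative only)."""
import random

WORKSPACE = {"x": (-0.6, 0.6), "y": (-0.6, 0.6), "z": (0.3, 1.2)}  # assumed arm reach [m]
OBSTACLES = [((0.2, 0.2, 0.6), 0.25)]  # (center, radius) spheres from a mapped environment
VEHICLE_RADIUS = 0.15                   # assumed aerial-vehicle body radius [m]
TAKEOFF_CLEARANCE = 0.4                 # extra free space required above the vehicle [m]


def in_workspace(p):
    return all(WORKSPACE[k][0] <= v <= WORKSPACE[k][1] for k, v in zip("xyz", p))


def collision_free(p, margin):
    for (cx, cy, cz), r in OBSTACLES:
        d = ((p[0] - cx) ** 2 + (p[1] - cy) ** 2 + (p[2] - cz) ** 2) ** 0.5
        if d < r + margin:
            return False
    return True


def sample_takeoff_pose(max_samples=1000):
    """Return a feasible (x, y, z) launch position, or None if none is found."""
    for _ in range(max_samples):
        p = tuple(random.uniform(*WORKSPACE[k]) for k in "xyz")
        above = (p[0], p[1], p[2] + TAKEOFF_CLEARANCE)  # crude check of the climb-out path
        if in_workspace(p) and collision_free(p, VEHICLE_RADIUS) and collision_free(above, VEHICLE_RADIUS):
            return p
    return None


if __name__ == "__main__":
    print("take-off pose:", sample_takeoff_pose())
```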
  2. We propose a demonstration of the Social Environment for Autonomous Navigation with Virtual Reality (VR) for advancing research in Human-Robot Interaction. In our demonstration, a user controls a virtual avatar in simulation and performs directed navigation tasks with a mobile robot in a warehouse environment. Our demonstration shows how researchers can leverage the immersive nature of VR to study robot navigation from a user-centered perspective in densely populated environments while avoiding physical safety concerns common with operating robots in the real world. This is important for studying interactions with robots driven by algorithms that are early in their development lifecycle. 
    While tremendous advances in visual and auditory realism have been made for virtual and augmented reality (VR/AR), introducing a plausible sense of physicality into the virtual world remains challenging. Closing the gap between real-world physicality and immersive virtual experience requires a closed interaction loop: applying user-exerted physical forces to the virtual environment and generating haptic sensations back to the users. However, existing VR/AR solutions either completely ignore the force inputs from the users or rely on obtrusive sensing devices that compromise user experience. By identifying users' muscle activation patterns while engaging in VR/AR, we design a learning-based neural interface for natural and intuitive force inputs. Specifically, we show that lightweight electromyography sensors, resting non-invasively on users' forearm skin, inform and establish a robust understanding of their complex hand activities. Fuelled by a neural-network-based model, our interface can decode finger-wise forces in real time with 3.3% mean error, and generalize to new users with little calibration. Through an interactive psychophysical study, we show that human perception of virtual objects' physical properties, such as stiffness, can be significantly enhanced by our interface. We further demonstrate that our interface enables ubiquitous control via finger tapping. Ultimately, we envision our findings pushing forward research towards more realistic physicality in future VR/AR.
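    As a loose illustration of a learning-based force decoder of the sort described above, the sketch below maps a window of multi-channel surface-EMG samples to per-finger force estimates with a small feed-forward network. The channel count, window length, and architecture are assumptions for illustration and are not the model from the cited work.

```python
"""Illustrative sketch of a finger-wise force decoder from surface EMG."""
import torch
import torch.nn as nn

N_CHANNELS = 8   # assumed number of EMG electrodes on the forearm
WINDOW = 64      # assumed samples per sliding window
N_FINGERS = 5    # one force estimate per finger


class ForceDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                         # (B, C, T) -> (B, C*T)
            nn.Linear(N_CHANNELS * WINDOW, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, N_FINGERS),             # per-finger force (arbitrary units)
        )

    def forward(self, emg_window: torch.Tensor) -> torch.Tensor:
        return self.net(emg_window)


if __name__ == "__main__":
    model = ForceDecoder()
    fake_emg = torch.randn(1, N_CHANNELS, WINDOW)  # one window of synthetic EMG
    print("predicted finger forces:", model(fake_emg).detach().numpy())
```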
  4. As augmented and virtual reality (AR/VR) technology matures, a method is desired to represent real-world persons visually and aurally in a virtual scene with high fidelity to craft an immersive and realistic user experience. Current technologies leverage camera and depth sensors to render visual representations of subjects through avatars, and microphone arrays are employed to localize and separate high-quality subject audio through beamforming. However, challenges remain in both realms. In the visual domain, avatars can only map key features (e.g., pose, expression) to a predetermined model, rendering them incapable of capturing the subjects’ full details. Alternatively, high-resolution point clouds can be utilized to represent human subjects. However, such three-dimensional data is computationally expensive to process. In the realm of audio, sound source separation requires prior knowledge of the subjects’ locations. However, it may take unacceptably long for sound source localization algorithms to provide this knowledge, which can still be error-prone, especially with moving objects. These challenges make it difficult for AR systems to produce real-time, high-fidelity representations of human subjects for applications such as AR/VR conferencing that mandate negligible system latency. We present Acuity, a real-time system capable of creating high-fidelity representations of human subjects in a virtual scene both visually and aurally. Acuity isolates subjects from high-resolution input point clouds. It reduces the processing overhead by performing background subtraction at a coarse resolution, then applying the detected bounding boxes to fine-grained point clouds. Meanwhile, Acuity leverages an audiovisual sensor fusion approach to expedite sound source separation. The estimated object location in the visual domain guides the acoustic pipeline to isolate the subjects’ voices without running sound source localization. Our results demonstrate that Acuity can isolate multiple subjects’ high-quality point clouds with a maximum latency of 70 ms and average throughput of over 25 fps, while separating audio in less than 30 ms. We provide the source code of Acuity at: https://github.com/nesl/Acuity. 
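    The coarse-to-fine isolation idea described above can be sketched roughly as follows: background subtraction is performed on a voxel-downsampled cloud to obtain an approximate bounding box, which is then used to crop the full-resolution cloud. The voxel size, box padding, and synthetic data are illustrative assumptions, not Acuity's implementation.

```python
"""Sketch of coarse-to-fine subject isolation from a point cloud (illustrative only)."""
import numpy as np


def coarse_foreground_bbox(frame, background, voxel=0.2):
    """Axis-aligned box around voxels occupied by the frame but not the background."""
    frame_vox = set(map(tuple, np.floor(frame / voxel).astype(int)))
    bg_vox = set(map(tuple, np.floor(background / voxel).astype(int)))
    fg = np.array(sorted(frame_vox - bg_vox), dtype=float) * voxel
    if len(fg) == 0:
        return None
    return fg.min(axis=0) - voxel, fg.max(axis=0) + voxel  # pad by one voxel


def crop(points, bbox):
    """Apply the coarse bounding box to the full-resolution cloud."""
    lo, hi = bbox
    mask = np.all((points >= lo) & (points <= hi), axis=1)
    return points[mask]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    background = rng.uniform(-3, 3, size=(5000, 3))              # static scene scan
    subject = rng.normal([1.0, 0.0, 1.0], 0.2, size=(2000, 3))   # person-sized blob
    frame = np.vstack([background, subject])                     # current frame
    bbox = coarse_foreground_bbox(frame, background)
    print("isolated subject points:", crop(frame, bbox).shape)
```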
    The Industrial Internet of Things has increased the number of sensors permanently installed in industrial plants. Yet there will be gaps in coverage due to broken sensors or sparse sensor density in very large plants, such as in the petrochemical industry. Modern emergency response operations are beginning to use Small Unmanned Aerial Systems (sUAS) as remote sensors to provide rapid, improved situational awareness. Ground-based sensors are an integral component of overall situational awareness platforms, as they can provide the longer-term persistent monitoring that aerial drones are unable to provide. Squishy Robotics and the Berkeley Emergent Space Tensegrities Laboratory have developed hardware and a framework for rapidly deploying sensor robots for integrated ground-aerial disaster response. The semi-autonomous delivery of sensors using tensegrity (tension-integrity) robotics relies on structures that are flexible, lightweight, and have high stiffness-to-weight ratios, making them ideal candidates for robust high-altitude deployments. Squishy Robotics has developed a tensegrity robot for commercial use in Hazardous Materials (HazMat) scenarios that is capable of being deployed from commercial drones or other aircraft. Squishy Robots have been successfully deployed from altitudes of up to 1,000 ft while carrying a delicate sensing and communication payload. This paper describes a framework for optimizing the deployment of emergency sensors spatially over time. AI techniques (e.g., Long Short-Term Memory neural networks) identify regions where sensors would be most valued, without requiring humans to enter the potentially dangerous area. The cost function for the optimization considers the costs of false-positive and false-negative errors. Decisions on mitigation include shutting down the plant or evacuating the local community. The Expected Value of Information (EVI) is used to identify the most valuable type and location of physical sensors to be deployed, increasing the decision-analytic value of a sensor network. A case study using data from the Tennessee Eastman process dataset of a chemical plant, displayed in OSIsoft, is provided.
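    A toy sketch of the expected-value-of-information ranking described above: for each candidate sensor location, the expected mitigation cost with and without the sensor's reading is compared under assumed detection characteristics and misclassification costs. The prior, costs, and candidate locations are invented for illustration and are not taken from the Tennessee Eastman case study.

```python
"""Toy expected-value-of-information (EVI) calculation for sensor placement."""

PRIOR_LEAK = 0.05                  # assumed prior probability of a hazardous release
COST_SHUTDOWN = 50_000             # assumed cost of shutting down the plant (mitigation)
COST_MISSED_INCIDENT = 2_000_000   # assumed cost of an undetected release (false negative)

# Hypothetical candidate locations with assumed sensitivity / specificity.
CANDIDATES = {
    "reactor_area": {"sens": 0.95, "spec": 0.90},
    "storage_tanks": {"sens": 0.80, "spec": 0.97},
    "loading_dock": {"sens": 0.60, "spec": 0.99},
}


def expected_cost(p_leak):
    """Cost of the best decision: shut down, or continue and risk a missed incident."""
    return min(COST_SHUTDOWN, p_leak * COST_MISSED_INCIDENT)


def evi(sens, spec, prior=PRIOR_LEAK):
    p_pos = sens * prior + (1 - spec) * (1 - prior)    # P(alarm)
    p_leak_pos = sens * prior / p_pos                  # posterior after an alarm
    p_leak_neg = (1 - sens) * prior / (1 - p_pos)      # posterior after no alarm
    cost_with_info = (p_pos * expected_cost(p_leak_pos)
                      + (1 - p_pos) * expected_cost(p_leak_neg))
    return expected_cost(prior) - cost_with_info


if __name__ == "__main__":
    ranked = sorted(CANDIDATES.items(), key=lambda kv: evi(**kv[1]), reverse=True)
    for name, params in ranked:
        print(f"{name}: EVI = ${evi(**params):,.0f}")
```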