

Title: EURECA: Enhanced Understanding of Real Environments via Crowd Assistance
Indoor robots hold the promise of automatically handling mundane daily tasks, helping to improve access for people with disabilities, and providing on-demand access to remote physical environments. Unfortunately, the ability to understand never-before-seen objects in scenes where new items may be added (e.g., purchased) or altered (e.g., damaged) on a regular basis remains an open challenge for robotics. In this paper, we introduce EURECA, a mixed-initiative system that leverages online crowds of human contributors to help robots robustly identify 3D point cloud segments corresponding to user-referenced objects in near real-time. EURECA allows robots to understand multi-object 3D scenes on-the-fly (in ∼40 seconds) by providing groups of non-expert crowd workers with intelligent tools that can segment objects more quickly (∼70% faster) and more accurately than individuals. More broadly, EURECA introduces the first real-time crowdsourcing tool that addresses the challenge of learning about new objects in real-world settings, creating a new source of data for training robots online, as well as a platform for studying mixed-initiative crowdsourcing workflows for understanding 3D scenes.
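The abstract does not describe implementation details, but the core idea of combining several workers' input into one segmentation can be illustrated with a small sketch. The code below shows one plausible way to merge crowd-provided point cloud segmentations by per-point majority vote; the function name, data layout, and workflow are assumptions for illustration, not EURECA's actual code.

```python
# Illustrative sketch (not EURECA's implementation): aggregating per-point
# object labels from several crowd workers into one segmentation by majority vote.
from collections import Counter
import numpy as np

def aggregate_segmentations(worker_labels: np.ndarray) -> np.ndarray:
    """worker_labels: (num_workers, num_points) array of integer object IDs
    (0 = background). Returns one label per point via majority vote."""
    num_workers, num_points = worker_labels.shape
    merged = np.zeros(num_points, dtype=int)
    for p in range(num_points):
        votes = Counter(worker_labels[:, p])
        merged[p] = votes.most_common(1)[0][0]
    return merged

# Example: three workers outline the same object in a six-point cloud.
workers = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
])
print(aggregate_segmentations(workers))  # -> [1 1 1 0 0 0]
```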
Award ID(s):
1638047
NSF-PAR ID:
10066635
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2018)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Video scene analysis is a well-investigated area where researchers have devoted efforts to detecting and classifying people and objects in the scene. However, real-life scenes are more complex: the intrinsic states of objects (e.g., machine operating states or human vital signs) are often overlooked by vision-based scene analysis. Recent work has proposed a radio frequency (RF) sensing technique, wireless vibrometry, that employs wireless signals to sense subtle vibrations from objects and infer their internal states. We envision that combining video scene analysis with wireless vibrometry forms a more comprehensive understanding of the scene, namely "rich scene analysis". However, the RF sensors used in wireless vibrometry only provide time series, and it is challenging to associate these time series with multiple real-world objects. We propose a real-time RF-vision sensor fusion system, Capricorn, that efficiently builds a cross-modal correspondence between visual pixels and RF time series to better understand the complex nature of a scene. The vision sensors in Capricorn model the surrounding environment in 3D and obtain the distances of different objects. In the RF domain, distance is proportional to the signal time-of-flight (ToF), and we can leverage the ToF to separate the RF time series corresponding to each object. The RF-vision sensor fusion in Capricorn brings multiple benefits. The vision sensors provide environmental context to guide the processing of RF data, which helps us select the most appropriate algorithms and models. Meanwhile, the RF sensor yields additional information that is invisible to vision sensors, providing insight into objects' intrinsic states. Our extensive evaluations show that Capricorn monitors multiple appliances' operating status in real time with over 97% accuracy and recovers vital signs such as respiration from multiple people. A video (https://youtu.be/b-5nav3Fi78) demonstrates the capability of Capricorn.
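As a rough illustration of the distance-based gating described above, the sketch below shows one plausible way to pick out each object's RF time series once the vision pipeline has estimated its distance, using the proportionality between distance and time-of-flight. The range resolution, function names, and data layout are assumptions, not Capricorn's implementation.

```python
# Illustrative sketch (not Capricorn's code): using object distances from a
# vision sensor to select each object's slow-time series from an RF range
# profile, since range bin corresponds to distance (distance ∝ time-of-flight).
import numpy as np

RANGE_RESOLUTION = 0.05      # assumed radar range-bin size in meters

def distance_to_bin(distance_m: float) -> int:
    """Map a vision-estimated distance to the nearest RF range bin."""
    return int(round(distance_m / RANGE_RESOLUTION))

def separate_objects(range_profile: np.ndarray, distances: dict) -> dict:
    """range_profile: (num_bins, num_frames) magnitudes over slow time.
    distances: {object_name: distance_m} from the vision pipeline.
    Returns one vibration time series per object."""
    return {name: range_profile[distance_to_bin(d), :]
            for name, d in distances.items()}

# Toy example: a fan at 1.5 m and a person at 3.0 m.
profile = np.random.rand(128, 1000)          # stand-in for real radar data
series = separate_objects(profile, {"fan": 1.5, "person": 3.0})
print({name: s.shape for name, s in series.items()})
```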
  2. Robotics has emerged as one of the most popular subjects in STEM (Science, Technology, Engineering, and Mathematics) education for students in elementary, middle, and high schools, providing them with an opportunity to gain knowledge of engineering and technology. In recent years, flying robots (or drones) have also gained popularity as a teaching tool for imparting the fundamentals of computer programming to high school students. However, despite completing a programming course, students may still lack an understanding of the working principles of drones. This paper proposes an approach to teaching students the basic principles of drone aeronautics through laboratory programming. The course was designed by professors from Vaughn College of Aeronautics and Technology for high school students who participate in after-school and weekend programs during the school year or summer. In early 2021, the college applied for and was approved to offer a certificate program in UAS (Unmanned Aerial Systems) Designs, Applications, and Operations to college students by the Education Department of New York State. Later that year, the college also received a grant from the Federal Aviation Administration (FAA) to provide tuition-free early higher education for high school students, allowing them to complete the majority of the credits in the UAS certificate program while still enrolled in high school. The program aims to equip students with the hands-on skills necessary for successful careers as versatile engineers and technicians. Most of the courses in the certificate program are introductory or application-oriented, such as Introduction to Drones, Drone Law, Part 107 License, or Fundamentals of Land Surveying and Photogrammetry. However, one of the courses, Introduction to Drone Aeronautics, is more focused on the theory of drone flight and control. Organizing the lectures and laboratory of this course for high school students who are interested in pursuing the certificate can be a challenge. To create the Introduction to Drone Aeronautics course, a variety of school courses and online resources were examined. After careful consideration, the Robolink Co-drone [1] was chosen as the experimental platform on which students study drone flight and learn to control and stabilize a drone. However, developing a set of comprehensible lectures proved to be a difficult task. Based on the requirements of the certificate program, the lectures were designed to cover the following topics: (a) an overview of the fundamentals of drone flight, including the forces acting on a drone such as lift, weight, drag, and thrust, as well as the selection of on-board components and the trade-offs required for proper payload and force balance; (b) an introduction to the proportional-integral-derivative (PID) controller and its role in stabilizing a drone and reducing steady-state errors; (c) an explanation of the forces acting on a drone in different coordinate frames, along with coordinate transformations; and (d) an opportunity for students to examine the dynamic model of a 3D quadcopter with control parameters, without requiring them to derive the 3D drone dynamic equations. In the future, the course can be improved to cater to the diverse learning needs of the students. More interactive and accessible tools can be developed to help different types of students understand drone aeronautics.
For instance, some students may prefer to apply mathematical skills to derive results, while others may find it easier to comprehend the stable flight of a drone by visualizing the continuous changes in forces and balances resulting from the control of DC motor speeds. Despite differences in students' mathematical abilities, the course has helped high school students appreciate that mathematics is a powerful tool for solving complex problems in the real world, rather than just a subject of abstract numbers.
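Topic (b) above centers on the PID controller. A minimal, self-contained sketch of a discrete PID loop for a single drone axis is shown below; the gains, time step, and toy altitude model are illustrative assumptions rather than material from the course.

```python
# Minimal discrete PID controller for one axis (e.g., holding a target altitude).
# Gains, time step, and the toy plant are illustrative, not course values.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        """Return the control command for one time step."""
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy usage: drive a first-order altitude model (climb rate = command) to 1.0 m.
pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.02)
altitude = 0.0
for _ in range(1000):
    climb_rate = pid.update(setpoint=1.0, measurement=altitude)
    altitude += climb_rate * 0.02
print(round(altitude, 3))   # ends close to the 1.0 m setpoint
```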
  3. Researchers, educators, and multimedia designers need to better understand how mixing physical tangible objects with virtual experiences affects learning and science identity. In this novel study, a 3D-printed tangible that is an accurate facsimile of the sort of expensive glassware that chemists use in real laboratories is tethered to a laptop with a digitized lesson. As interactive educational content is increasingly placed online, it is important to understand the educational boundary conditions associated with passive haptics and 3D-printed manipulables. Cost-effective printed objects would be particularly welcome in rural and low socioeconomic status (SES) classrooms. A Mixed Reality (MR) experience was created that used a physical 3D-printed haptic burette to control a computer-based chemistry titration experiment. This randomized control trial study with 136 college students had two conditions: 1) a low-embodied control (using keyboard arrows), and 2) a high-embodied experimental condition (physically turning a valve/stopcock on the 3D-printed burette). Although both groups displayed similar significant gains on the declarative knowledge test, deeper analyses revealed nuanced Aptitude by Treatment Interactions (ATIs). These interactions favored the high-embodied experimental group that used the MR device, both for titration-specific posttest knowledge questions and for science efficacy and science identity. Students with higher prior science knowledge displayed higher titration knowledge scores after using the experimental 3D-printed haptic device. A multi-modal linguistic and gesture analysis revealed that during recall the experimental participants used the stopcock-turning gesture significantly more often, and their recalls produced a significantly different Epistemic Network Analysis (ENA). ENA is a type of 2D projection of the recall data; stronger connections were seen in the high-embodied group, centering mainly on the key hand-turning gesture. Instructors and designers should consider the multi-modal and multi-dimensional nature of the user interface, and how the addition of another sensory-based learning signal (haptics) might differentially affect lower prior knowledge students. One hypothesis is that haptically manipulating novel devices during learning may create more cognitive load. For low prior knowledge students, it may be advantageous to begin learning content on a more ubiquitous interface (e.g., a keyboard) before moving to more novel, multi-modal MR devices/interfaces.
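An Aptitude-by-Treatment Interaction of the kind reported above is commonly tested by including a prior-knowledge-by-condition interaction term in a regression model. The sketch below illustrates that general idea on synthetic data; it is not the study's analysis code, and the variable names, sample, and effect sizes are invented.

```python
# Illustrative sketch (not the study's analysis): testing an Aptitude-by-Treatment
# Interaction (ATI) by regressing posttest scores on prior knowledge, condition,
# and their interaction. All data here are synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 136
prior = rng.normal(0, 1, n)                  # standardized prior science knowledge
condition = rng.integers(0, 2, n)            # 0 = keyboard control, 1 = haptic burette
# Synthetic effect: the haptic condition helps higher-prior-knowledge students more.
posttest = 0.5 * prior + 0.2 * condition + 0.4 * prior * condition + rng.normal(0, 1, n)

df = pd.DataFrame({"posttest": posttest, "prior": prior, "condition": condition})
model = smf.ols("posttest ~ prior * condition", data=df).fit()
print(model.params)   # the prior:condition coefficient captures the ATI
```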

     
  4. Manipulation of deformable objects is a desired skill for making robots ubiquitous in manufacturing, service, healthcare, and security. Common deformable objects (e.g., wires, clothes, bed sheets) are significantly more difficult to model than rigid objects. In this research, we contribute to the model-based manipulation of linear flexible objects such as cables. We propose a 3D geometric model of a linear flexible object subject to gravity, and a physical model consisting of multiple links connected by revolute joints with identified model parameters. These models enable task automation in manipulating linear flexible objects in both simulation and the real world. To bridge the gap between simulation and the real world and build a close-to-reality simulation of flexible objects, we propose a new strategy called Simulation-to-Real-to-Simulation (Sim2Real2Sim). We demonstrate the feasibility of our approach by completing the Plug Task used in the 2015 DARPA Robotics Challenge Finals, which involves unplugging a power cable from one socket and plugging it into another, in both simulation and the real world. Numerical experiments are implemented to validate our approach.
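As a rough illustration of modeling a linear flexible object as a chain of links connected by revolute joints, the sketch below computes joint positions for a planar multi-link cable via forward kinematics. The link count, link length, and function names are assumptions for illustration, not the paper's model.

```python
# Simplified sketch (not the paper's model): a hanging cable approximated as a
# planar chain of equal-length links joined by revolute joints, with the 2D
# position of each joint computed from the joint angles via forward kinematics.
import numpy as np

def cable_joint_positions(joint_angles, link_length=0.1, base=(0.0, 0.0)):
    """joint_angles: relative angles (rad) between consecutive links.
    Returns an (n+1, 2) array of joint positions, starting at the base."""
    positions = [np.array(base, dtype=float)]
    heading = -np.pi / 2                      # first link hangs straight down
    for angle in joint_angles:
        heading += angle
        step = link_length * np.array([np.cos(heading), np.sin(heading)])
        positions.append(positions[-1] + step)
    return np.array(positions)

# Example: a 10-link cable that bends slightly at each joint.
angles = np.full(10, 0.05)
print(cable_joint_positions(angles).round(3))
```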
  5. While current vision algorithms excel at many challenging tasks, it is unclear how well they understand the physical dynamics of real-world environments. Here we introduce Physion, a dataset and benchmark for rigorously evaluating the ability to predict how physical scenarios will evolve over time. Our dataset features realistic simulations of a wide range of physical phenomena, including rigid and soft-body collisions, stable multi-object configurations, rolling, sliding, and projectile motion, thus providing a more comprehensive challenge than previous benchmarks. We used Physion to benchmark a suite of models varying in their architecture, learning objective, input-output structure, and training data. In parallel, we obtained precise measurements of human prediction behavior on the same set of scenarios, allowing us to directly evaluate how well any model could approximate human behavior. We found that vision algorithms that learn object-centric representations generally outperform those that do not, yet still fall far short of human performance. On the other hand, graph neural networks with direct access to physical state information both perform substantially better and make predictions that are more similar to those made by humans. These results suggest that extracting physical representations of scenes is the main bottleneck to achieving human-level and human-like physical understanding in vision algorithms. We have publicly released all data and code to facilitate the use of Physion to benchmark additional models in a fully reproducible manner, enabling systematic evaluation of progress towards vision algorithms that understand physical environments as robustly as people do. 
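Benchmarking in this setting amounts to comparing each model's per-scenario predictions both to ground-truth outcomes and to human responses on the same scenarios. The sketch below illustrates that comparison with toy numbers; the task framing, function names, and metrics are assumptions for illustration, not Physion's released evaluation code.

```python
# Illustrative sketch (not Physion's code): scoring per-scenario binary
# predictions against ground truth and against average human responses.
import numpy as np

def benchmark(model_probs, human_probs, outcomes):
    """model_probs, human_probs: per-scenario probability of 'yes';
    outcomes: ground-truth 0/1 labels. Returns accuracies and model-human agreement."""
    model_acc = np.mean((model_probs > 0.5) == outcomes)
    human_acc = np.mean((human_probs > 0.5) == outcomes)
    agreement = np.corrcoef(model_probs, human_probs)[0, 1]
    return {"model_accuracy": model_acc, "human_accuracy": human_acc,
            "model_human_correlation": agreement}

# Toy numbers for five scenarios.
outcomes = np.array([1, 0, 1, 1, 0])
human = np.array([0.9, 0.2, 0.7, 0.8, 0.4])
model = np.array([0.6, 0.3, 0.4, 0.9, 0.5])
print(benchmark(model, human, outcomes))
```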