skip to main content


Title: Teaching Cameras to Feel: Estimating Tactile Physical Properties of Surfaces from Images
The connection between visual input and tactile sensing is critical for object manipulation tasks such as grasping and pushing. In this work, we introduce the challenging task of estimating a set of tactile physical properties from visual information. We aim to build a model that learns the complex mapping between visual information and tactile physical properties. We construct a first of its kind image-tactile dataset with over 400 multiview image sequences and the corresponding tactile properties. A total of fifteen tactile physical properties across categories including friction, compliance, adhesion, texture, and thermal conductance are measured and then estimated by our models. We develop a cross-modal framework comprised of an adversarial objective and a novel visuo-tactile joint classification loss. Additionally, we introduce a neural architecture search framework capable of selecting optimal combinations of viewing angles for estimating a given physical property.  more » « less
Award ID(s):
1715195
NSF-PAR ID:
10292301
Author(s) / Creator(s):
Date Published:
Journal Name:
European Conference on Computer Vision
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Tactile graphics are a common way to present information to people with vision impairments. Tactile graphics can be used to explore a broad range of static visual content but aren’t well suited to representing animation or interactivity. We introduce a new approach to creating dynamic tactile graphics that combines a touch screen tablet, static tactile overlays, and small mobile robots. We introduce a prototype system called RoboGraphics and several proof-of-concept applications. We evaluated our prototype with seven participants with varying levels of vision, comparing the RoboGraphics approach to a flat screen, audio-tactile interface. Our results show that dynamic tactile graphics can help visually impaired participants explore data quickly and accurately. 
    more » « less
  2. null (Ed.)
    A major issue with upper limb prostheses is the disconnect between sensory information perceived by the user and the information perceived by the prosthesis. Advances in prosthetic technology introduced tactile information for monitoring grasping activity, but visual information, a vital component in the human sensory system, is still not fully utilized as a form of feedback to the prosthesis. For able-bodied individuals, many of the decisions for grasping or manipulating an object, such as hand orientation and aperture, are made based on visual information before contact with the object. We show that inclusion of neuromorphic visual information, combined with tactile feedback, improves the ability and efficiency of both able-bodied and amputee subjects to pick up and manipulate everyday objects.We discovered that combining both visual and tactile information in a real-time closed loop feedback strategy generally decreased the completion time of a task involving picking up and manipulating objects compared to using a single modality for feedback. While the full benefit of the combined feedback was partially obscured by experimental inaccuracies of the visual classification system, we demonstrate that this fusion of neuromorphic signals from visual and tactile sensors can provide valuable feedback to a prosthetic arm for enhancing real-time function and usability. 
    more » « less
  3. Loop closure detection is a fundamental problem for simultaneous localization and mapping (SLAM) in robotics. Most of the previous methods only consider one type of information, based on either visual appearances or spatial relationships of landmarks. In this paper, we introduce a novel visual-spatial information preserving multi-order graph matching approach for long-term loop closure detection. Our approach constructs a graph representation of a place from an input image to integrate visual-spatial information, including visual appearances of the landmarks and the background environment, as well as the second and third-order spatial relationships between two and three landmarks, respectively. Furthermore, we introduce a new formulation that formulates loop closure detection as a multi-order graph matching problem to compute a similarity score directly from the graph representations of the query and template images, instead of performing conventional vector-based image matching. We evaluate the proposed multi-order graph matching approach based on two public long-term loop closure detection benchmark datasets, including the St. Lucia and CMU-VL datasets. Experimental results have shown that our approach is effective for long-term loop closure detection and it outperforms the previous state-of-the-art methods. 
    more » « less
  4. Conventional Intelligent Virtual Agents (IVAs) focus primarily on the visual and auditory channels for both the agent and the interacting human: the agent displays a visual appearance and speech as output, while processing the human’s verbal and non-verbal behavior as input. However, some interactions, particularly those between a patient and healthcare provider, inherently include tactile components.We introduce an Intelligent Physical-Virtual Agent (IPVA) head that occupies an appropriate physical volume; can be touched; and via human-in-the-loop control can change appearance, listen, speak, and react physiologically in response to human behavior. Compared to a traditional IVA, it provides a physical affordance, allowing for more realistic and compelling human-agent interactions. In a user study focusing on neurological assessment of a simulated patient showing stroke symptoms, we compared the IPVA head with a high-fidelity touch-aware mannequin that has a static appearance. Various measures of the human subjects indicated greater attention, affinity for, and presence with the IPVA patient, all factors that can improve healthcare training. 
    more » « less
  5. Tactile sensing has been increasingly utilized in robot control of unknown objects to infer physical properties and optimize manipulation. However, there is limited understanding about the contribution of different sensory modalities during interactive perception in complex interaction both in robots and in humans. This study investigated the effect of visual and haptic information on humans’ exploratory interactions with a ‘cup of coffee’, an object with nonlinear internal dynamics. Subjects were instructed to rhythmically transport a virtual cup with a rolling ball inside between two targets at a specified frequency, using a robotic interface. The cup and targets were displayed on a screen, and force feedback from the cup-andball dynamics was provided via the robotic manipulandum. Subjects were encouraged to explore and prepare the dynamics by “shaking” the cup-and-ball system to find the best initial conditions prior to the task. Two groups of subjects received the full haptic feedback about the cup-and-ball movement during the task; however, for one group the ball movement was visually occluded. Visual information about the ball movement had two distinctive effects on the performance: it reduced preparation time needed to understand the dynamics and, importantly, it led to simpler, more linear input-output interactions between hand and object. The results highlight how visual and haptic information regarding nonlinear internal dynamics have distinct roles for the interactive perception of complex objects. 
    more » « less