

This content will become publicly available on June 25, 2026

Title: Evaluating User Preferences in Sharing Sensitive Information via Telepresence Robots
This paper explores user preferences for sharing sensitive information via telepresence robots using six input methods: pen & paper, smartphone, robot display, speech, whisper, and silent speech. Through a crowdsourced survey and a follow-up user study, it identifies key differences in effort, convenience, privacy, security, and social acceptability. Speech is perceived as the easiest but least secure method, while pen & paper, initially favored, proves inconvenient in practice. Robot display and smartphone consistently rank as the most secure, private, and socially acceptable. Silent speech emerges as a strong alternative, offering greater privacy than other speech-based methods. These findings highlight the need for telepresence robots to support multiple input methods to accommodate diverse user needs and privacy concerns.
Award ID(s):
2239633
PAR ID:
10638001
Author(s) / Creator(s):
; ;
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400714023
Page Range / eLocation ID:
386 to 395
Subject(s) / Keyword(s):
Human-Robot Interaction Privacy Security Speech Whisper Silent Speech Smartphone Optical Character Recognition
Format(s):
Medium: X
Location:
Corfu Island Greece
Sponsoring Org:
National Science Foundation
More Like this
  1. In this article, we present a live speech-driven, avatar-mediated, three-party telepresence system through which three distant users, embodied as avatars in a shared 3D virtual world, can hold a natural three-way conversation without tracking devices. Based on live speech input from the three users, the system generates the corresponding conversational motions of all avatars in real time, including head motion, eye motion, lip movement, torso motion, and hand gestures. All motions are generated automatically on each user's side from live speech input, and a cloud server transmits and synchronizes motion and speech among the users. We conducted a formal user study to evaluate the usability and effectiveness of the system by comparing it with a well-known online virtual world, Second Life, and a widely used online teleconferencing system, Skype. The results indicate that our system provides a measurably better telepresence experience than the two widely used alternatives.
  2. Augmented reality (AR), which overlays virtual content on top of the user's perception of the real world, has begun to enter the consumer market. Beyond smartphone platforms, early-stage head-mounted displays such as the Microsoft HoloLens are under active development. Many compelling uses of these technologies are multi-user: e.g., in-person collaborative tools, multiplayer gaming, and telepresence. While prior work on AR security and privacy has studied potential risks from AR applications, new risks will also arise among multiple human users. In this work, we explore the challenges of designing secure and private content sharing for multi-user AR. We analyze representative application case studies and systematize design goals for security and functionality that a multi-user AR platform should support. We design an AR content sharing control module that achieves these goals and build a prototype implementation (ShareAR) for the HoloLens. This work lays foundations for secure and private multi-user AR interactions.
  3. Silent speech is unaffected by ambient noise, increases accessibility, and enhances privacy and security. Yet current silent speech recognizers operate in a phrase-in/phrase-out manner and are therefore slow, error-prone, and impractical for mobile devices. We present MELDER, a Mobile Lip Reader that operates in real time by splitting the input video into smaller temporal segments and processing them individually. An experiment revealed that this substantially improves computation time, making the approach suitable for mobile devices. We further optimize the model for everyday use by exploiting knowledge from a high-resource vocabulary via transfer learning. We then compare MELDER in both stationary and mobile settings with two state-of-the-art silent speech recognizers, where MELDER demonstrates superior overall performance. Finally, we compare two visual feedback methods of MELDER with the visual feedback method of Google Assistant; the outcomes shed light on how these feedback methods influence users' perceptions of the model's performance.
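The real-time strategy described in item 3, recognizing short temporal windows instead of a whole phrase, can be sketched as a sliding window over the frame sequence. This is only an illustration; the segment length and stride below are placeholder values, not MELDER's actual parameters:

```python
def split_segments(frames, seg_len=20, stride=10):
    """Split a frame sequence into overlapping temporal segments.

    Each segment can be fed to the recognizer as soon as it is full,
    so output appears incrementally instead of only after the whole
    phrase ends. seg_len and stride are illustrative values only.
    """
    last_start = max(len(frames) - seg_len, 0)
    return [frames[s:s + seg_len] for s in range(0, last_start + 1, stride)]
```

Overlap between consecutive windows (stride smaller than seg_len) lets a recognizer smooth its predictions across segment boundaries.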
  4. While the ultimate goal of natural-language-based Human-Robot Interaction (HRI) may be free-form, mixed-initiative dialogue, social robots deployed in the near future will likely engage primarily in wakeword-driven interaction, in which users' commands are prefaced by a wakeword such as "Hey, Robot." This style of interaction helps allay user privacy concerns, as the robot's full speech recognition module need not be engaged until the target wakeword is heard. Unfortunately, there are a number of concerns in the popular media surrounding this style of interaction, with consumers fearing that it trains users (in particular, children) to be rude toward technology and, by extension, toward other humans. In this paper, we present a study demonstrating how an alternate style of wakeword, i.e., "Excuse me, Robot," may allay this concern by priming users to phrase commands as Indirect Speech Acts.
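The privacy rationale in item 4, that the full recognizer stays idle until a wakeword is heard, can be illustrated with a minimal gate over already-transcribed utterances. The wakeword strings and phrases below are hypothetical examples, not the study's materials:

```python
# Hypothetical wakewords for illustration, echoing the two styles
# contrasted in the paper.
WAKEWORDS = ("hey robot", "excuse me robot")

def gate_commands(utterances):
    """Return only the commands that follow a recognized wakeword.

    Conceptually, full speech recognition is 'off' until a wakeword
    is detected, so un-prefaced utterances are never processed.
    """
    commands = []
    for utt in utterances:
        text = utt.lower().strip()
        for wake in WAKEWORDS:
            if text.startswith(wake):
                commands.append(text[len(wake):].lstrip(" ,"))
                break
    return commands
```

A real system would gate on the raw audio stream with a lightweight keyword spotter rather than on transcribed text, but the control flow is the same.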
  5. Silent speech interfaces offer an alternative and efficient communication modality for individuals with voice disorders and for situations in which vocalized speech is compromised by noisy environments. Despite recent progress in developing silent speech interfaces, these systems face several challenges that prevent their wide acceptance, such as bulkiness, obtrusiveness, and immobility. Herein, the material optimization, structural design, deep learning algorithm, and system integration of mechanically and visually unobtrusive silent speech interfaces are presented, realizing both speaker identification and speech content identification. Conformal, transparent, and self-adhesive electromyography electrode arrays are designed to capture speech-relevant muscle activities, and temporal convolutional networks are employed to recognize speakers and convert sensing signals into spoken content. The resulting silent speech interfaces achieve 97.5% speaker classification accuracy and 91.5% keyword classification accuracy using four electrodes. The speech interface is further integrated with an optical hand-tracking system and a robotic manipulator for human-robot collaboration in both assembly and disassembly processes. The integrated system controls the robot manipulator via silent speech and facilitates the hand-over process through hand motion trajectory detection. The developed framework enables natural robot control in noisy environments and lays the groundwork for collaborative human-robot tasks involving multiple human operators.
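The temporal convolutional networks mentioned in item 5 are built from dilated causal 1-D convolutions. A minimal pure-Python sketch of that building block follows; the paper's actual architecture, layer counts, and weights are not specified here, so this is only the core operation:

```python
def causal_dilated_conv(signal, kernel, dilation=1):
    """One dilated causal 1-D convolution, the core TCN building block.

    Output at time t depends only on inputs at t and earlier (causal),
    with taps spaced 'dilation' steps apart so deeper layers can see a
    long history at low cost. Illustrative sketch, not the paper's model.
    """
    out = []
    k = len(kernel)
    for t in range(len(signal)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t - (k - 1 - i) * dilation  # look back in time only
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out
```

Stacking such layers with dilations 1, 2, 4, ... gives an exponentially growing receptive field over the EMG time series while keeping each layer cheap.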