- Award ID(s):
- 1849997
- PAR ID:
- 10442721
- Date Published:
- Journal Name:
- 32nd USENIX Security Symposium (USENIX Security 23)
- Page Range / eLocation ID:
- 2419--2436
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
It is estimated that by the year 2024, the total number of systems equipped with voice assistant software will exceed 8.4 billion devices globally. While these devices provide convenience to consumers, they suffer from a myriad of security issues. This paper highlights the serious privacy threats exposed by information leakage in a smart assistant's encrypted network traffic metadata. To investigate this issue, we have collected a new dataset composed of dynamic and static commands posed to an Amazon Echo Dot using data collection and cleaning scripts we developed. Furthermore, we propose the Smart Home Assistant Malicious Ensemble model (SHAME) as the new state-of-the-art Voice Command Fingerprinting classifier. When evaluated against several datasets, our attack correctly classifies encrypted voice commands with up to 99.81% accuracy on Google Home traffic and 95.2% accuracy on Amazon Echo Dot traffic. These findings show that security measures must be taken to stop internet service providers, nation-states, and network eavesdroppers from monitoring our intimate conversations.more » « less
-
This paper presents the design and implementation of Scribe, a comprehensive voice processing and handwriting interface for voice assistants. Distinct from prior works, Scribe is a precise tracking interface that can co-exist with the voice interface on low sampling rate voice assistants. Scribe can be used for 3D free-form drawing, writing, and motion tracking for gaming. Taking handwriting as a specific application, it can also capture natural strokes and the individualized style of writing while occupying only a single frequency. The core technique includes an accurate acoustic ranging method called Cross Frequency Continuous Wave (CFCW) sonar, enabling voice assistants to use ultrasound as a ranging signal while using the regular microphone system of voice assistants as a receiver. We also design a new optimization algorithm that only requires a single frequency for time difference of arrival. Scribe prototype achieves 73 μm of median error for 1D ranging and 1.4 mm of median error in 3D tracking of an acoustic beacon using the microphone array used in voice assistants. Our implementation of an in-air handwriting interface achieves 94.1% accuracy with automatic handwriting-to-text software, similar to writing on paper (96.6%). At the same time, the error rate of voice-based user authentication only increases from 6.26% to 8.28%.
-
null (Ed.)People interacting with voice assistants are often frustrated by voice assistants' frequent errors and inability to respond to backchannel cues. We introduce an open-source video dataset of 21 participants' interactions with a voice assistant, and explore the possibility of using this dataset to enable automatic error recognition to inform self-repair. The dataset includes clipped and labeled videos of participants' faces during free-form interactions with the voice assistant from the smart speaker's perspective. To validate our dataset, we emulated a machine learning classifier by asking crowdsourced workers to recognize voice assistant errors from watching soundless video clips of participants' reactions. We found trends suggesting it is possible to determine the voice assistant's performance from a participant's facial reaction alone. This work posits elicited datasets of interactive responses as a key step towards improving error recognition for repair for voice assistants in a wide variety of applications.more » « less
-
Intelligent voice assistants, and the thirdparty apps (aka “skills” or “actions”) that power them, are increasing in popularity and beginning to experiment with the ability to continuously listen to users. This paper studies how privacy concerns related to such always-listening voice assistants might affect consumer behavior and whether certain privacy mitigations would render them more acceptable. To explore these questions with more realistic user choices, we built an interactive app store that allowed users to install apps for a hypothetical always-listening voice assistant. In a study with 214 participants, we asked users to browse the app store and install apps for different voice assistants that offered varying levels of privacy protections. We found that users were generally more willing to install continuously-listening apps when there were greater privacy protections, but this effect was not universally present. The majority did not review any permissions in detail, but still expressed a preference for stronger privacy protections. Our results suggest that privacy factors into user choice, but many people choose to skip this information.more » « less
-
Deceptive design patterns (sometimes called “dark patterns”) are user interface design elements that may trick, deceive, or mislead users into behaviors that often benefit the party implementing the design over the end user. Prior work has taxonomized, investigated, and measured the prevalence of such patterns primarily in visual user interfaces (e.g., on websites). However, as the ubiquity of voice assistants and other voice-assisted technologies increases, we must anticipate how deceptive designs will be (and indeed, are already) deployed in voice interactions. This paper makes two contributions towards characterizing and surfacing deceptive design patterns in voice interfaces. First, we make a conceptual contribution, identifying key characteristics of voice interfaces that may enable deceptive design patterns, and surfacing existing and theoretical examples of such patterns. Second, we present the findings from a scenario-based user survey with 93 participants, in which we investigate participants’ perceptions of voice interfaces that we consider to be both deceptive and non-deceptive.more » « less