Title: Smart Speaker Command Dataset
This dataset contains a collection of voice commands for a smart speaker, each beginning with the common wake-word "Hey Alexa". The commands cover a range of tasks such as music control, smart home management, information requests, reminders, shopping, entertainment, and communication. The dataset reflects natural language usage from a diverse group of speakers, capturing various phrasings, inflections, and contexts. It includes contributions from both male and female voices and features speakers with different native languages. If you plan to download this dataset, we would appreciate it very much if you could fill out the Google form at https://forms.gle/dixQ4mkZ4xbXtXRDA. This will help us understand the usage and impacts of this dataset. Your feedback will also help us improve any future extensions of this work.
Award ID(s):
2104758
PAR ID:
10534099
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
University of Rochester
Date Published:
Subject(s) / Keyword(s):
Electrical engineering Acoustics and acoustical devices waves Smart Speakers Wake word speech
Format(s):
Medium: X
Size(s):
470477917 Bytes
Right(s):
Creative Commons Attribution 4.0 International
Sponsoring Org:
National Science Foundation
More Like this
  1. The dataset includes impulse responses recorded from 14 different rooms. Each room has unique acoustic properties, providing a wide range of RT60, clarity, and EDT values. The recordings are 48 kHz, 32-bit, mono WAV files. The dataset is organized by room, with each subfolder containing the impulse responses specific to that room, as well as a general layout of the room and plots of acoustic data. This dataset supports "Estimating direction of arrival in reverberant environments for wake-word detection using a single structural vibration sensor," published in the Journal of the Acoustical Society of America, Vol. 156, Iss. 4, October 2024. If you plan to download this dataset, we would appreciate it very much if you could fill out the Google form at https://forms.gle/jnuP2dYRK3CPmXQG6. This will help us understand the usage and impacts of this dataset. Your feedback will also help us improve any future extensions of this work.
  2. It is estimated that by the year 2024, the total number of systems equipped with voice assistant software will exceed 8.4 billion devices globally. While these devices provide convenience to consumers, they suffer from a myriad of security issues. This paper highlights the serious privacy threats exposed by information leakage in a smart assistant's encrypted network traffic metadata. To investigate this issue, we have collected a new dataset composed of dynamic and static commands posed to an Amazon Echo Dot using data collection and cleaning scripts we developed. Furthermore, we propose the Smart Home Assistant Malicious Ensemble model (SHAME) as the new state-of-the-art Voice Command Fingerprinting classifier. When evaluated against several datasets, our attack correctly classifies encrypted voice commands with up to 99.81% accuracy on Google Home traffic and 95.2% accuracy on Amazon Echo Dot traffic. These findings show that security measures must be taken to stop internet service providers, nation-states, and network eavesdroppers from monitoring our intimate conversations. 
  3. Voice-controlled interactive smart speakers, such as Google Home, Amazon Echo, and Apple HomePod, are becoming commonplace in today's homes. These devices listen continually for user commands, which are triggered by special keywords such as "Alexa" and "Hey Siri". Recent research has shown that these devices are vulnerable to attacks through malicious voice commands from nearby devices. The commands can be sent easily during unoccupied periods, so that the user may be unaware of such attacks. We present EchoSafe, a user-friendly sonar-based defense against these attacks. When the user sends a critical command to the smart speaker, EchoSafe sends an audio pulse followed by post-processing to determine if the user is present in the room. We can detect the user's presence during critical commands with 93.13% accuracy, and our solution can be extended to defend against other attack scenarios as well.
  4. Extensive recent research has shown that it is surprisingly easy to infer Amazon Alexa voice commands from their network traffic data. To prevent these traffic analytics (TA)-based inference attacks, smart home owners are considering deploying virtual private networks (VPNs) to safeguard their smart speakers. In this work, we design a new machine learning-powered attack framework, VoiceAttack, that can still accurately fingerprint voice commands in VPN-encrypted smart speaker network traffic. We evaluate VoiceAttack under 5 different real-world settings using Amazon Alexa and Google Home. Our results show that, by eavesdropping on VPN-encrypted network traffic, VoiceAttack can correctly infer voice command sentences with a Matthews Correlation Coefficient (MCC) of 0.68 in a closed-world setting and infer voice command categories with an MCC of 0.84 in an open-world setting. This presents a significant risk to user privacy and security, as it suggests that external on-path attackers could still intercept and decipher users’ voice commands despite the VPN encryption. We then further examine the sensitivity of smart speaker commands to VoiceAttack and find that 134 commands are highly vulnerable. We also present a defense approach, VoiceDefense, which injects appropriate traffic “noise” into smart speaker traffic. Our evaluation results show that VoiceDefense can effectively mitigate VoiceAttack on Amazon Echo and Google Home.
  5. Internet-connected voice-controlled speakers, also known as smart speakers, are increasingly popular due to their convenience for everyday tasks such as asking about the weather forecast or playing music. However, such convenience comes with privacy risks: smart speakers need to constantly listen in order to activate when the “wake word” is spoken, and are known to transmit audio from their environment and record it on cloud servers. In particular, this paper focuses on the privacy risk from smart speaker misactivations, i.e., when they activate, transmit, and/or record audio from their environment when the wake word is not spoken. To enable repeatable, scalable experiments for exposing smart speakers to conversations that do not contain wake words, we turn to playing audio from popular TV shows from diverse genres. After playing two rounds of 134 hours of content from 12 TV shows near popular smart speakers in both the US and in the UK, we observed cases of 0.95 misactivations per hour, or 1.43 times for every 10,000 words spoken, with some devices having 10% of their misactivation durations lasting at least 10 seconds. We characterize the sources of such misactivations and their implications for consumers, and discuss potential mitigations.