skip to main content


Search for: All records

Creators/Authors contains: "Dubois, Daniel J."

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. As remote work and learning increases in popularity, individuals, especially those with hearing impairments or who speak English as a second language, may depend on automated transcriptions to participate in business, school, entertainment, or basic communication. In this work, we investigate the automated transcription accuracy of seven popular social media and videoconferencing platforms with respect to some personal characteristics of their users, including gender, age, race, first language, speech rate, F0 frequency, and speech readability. We performed this investigation on a new corpus of 194 hours of English monologues by 846 TED talk speakers. Our results show the presence of significant bias, with transcripts less accurate for speakers that are male or non-native English speakers. We also observe differences in accuracy among platforms for different types of speakers. These results indicate that, while platforms have improved their automatic captioning, much work remains to make captions accessible for a wider variety of speakers and listeners.

     
    more » « less
    Free, publicly-accessible full text available May 31, 2025
  2. Consumer Internet of Things (IoT) devices are increasingly common, from smart speakers to security cameras, in homes. Along with their benefits come potential privacy and security threats. To limit these threats a number of commercial services have become available (IoT safeguards). The safeguards claim to provide protection against IoT privacy risks and security threats. However, the effectiveness and the associated privacy risks of these safeguards remains a key open question. In this paper, we investigate the threat detection capabilities of IoT safeguards for the first time. We develop and release an approach for automated safeguards experimentation to reveal their response to common security threats and privacy risks. We perform thousands of automated experiments using popular commercial IoT safeguards when deployed in a large IoT testbed. Our results indicate not only that these devices may be ineffective in preventing risks, but also their cloud interactions and data collection operations may introduce privacy risks for the households that adopt them. 
    more » « less
  3. Internet-of-Things (IoT) devices are ubiquitous, but little attention has been paid to how they may incorporate dark patterns despite consumer protections and privacy concerns arising from their unique access to intimate spaces and always-on capabilities. This paper conducts a systematic investigation of dark patterns in 57 popular, diverse smart home devices. We update manual interaction and annotation methods for the IoT context, then analyze dark pattern frequency across device types, manufacturers, and interaction modalities. We find that dark patterns are pervasive in IoT experiences, but manifest in diverse ways across device traits. Speakers, doorbells, and camera devices contain the most dark patterns, with manufacturers of such devices (Amazon and Google) having the most dark patterns compared to other vendors. We investigate how this distribution impacts the potential for consumer exposure to dark patterns, discuss broader implications for key stakeholders like designers and regulators, and identify opportunities for future dark patterns study. 
    more » « less
  4. null (Ed.)
    Abstract Despite the prevalence of Internet of Things (IoT) devices, there is little information about the purpose and risks of the Internet traffic these devices generate, and consumers have limited options for controlling those risks. A key open question is whether one can mitigate these risks by automatically blocking some of the Internet connections from IoT devices, without rendering the devices inoperable. In this paper, we address this question by developing a rigorous methodology that relies on automated IoT-device experimentation to reveal which network connections (and the information they expose) are essential, and which are not. We further develop strategies to automatically classify network traffic destinations as either required ( i.e. , their traffic is essential for devices to work properly) or not, hence allowing firewall rules to block traffic sent to non-required destinations without breaking the functionality of the device. We find that indeed 16 among the 31 devices we tested have at least one blockable non-required destination, with the maximum number of blockable destinations for a device being 11. We further analyze the destination of network traffic and find that all third parties observed in our experiments are blockable, while first and support parties are neither uniformly required or non-required. Finally, we demonstrate the limitations of existing blocklists on IoT traffic, propose a set of guidelines for automatically limiting non-essential IoT traffic, and we develop a prototype system that implements these guidelines. 
    more » « less
  5. null (Ed.)
    Internet of Things (IoT) devices are becoming increasingly popular and offer a wide range of services and functionality to their users. However, there are significant privacy and security risks associated with these devices. IoT devices can infringe users' privacy by ex-filtrating their private information to third parties, often without their knowledge. In this work we investigate the possibility to identify IoT devices and their location in an Internet Service Provider's network. By analyzing data from a large Internet Service Provider (ISP), we show that it is possible to recognize specific IoT devices, their vendors, and sometimes even their specific model, and to infer their location in the network. This is possible even with sparsely sampled flow data that are often the only datasets readily available at an ISP. We evaluate our proposed methodology to infer IoT devices at subscriber lines of a large ISP. Given ground truth information on IoT devices location and models, we were able to detect more than 77% of the studied IoT devices from sampled flow data in the wild. 
    more » « less
  6. Abstract Internet-connected voice-controlled speakers, also known as smart speakers , are increasingly popular due to their convenience for everyday tasks such as asking about the weather forecast or playing music. However, such convenience comes with privacy risks: smart speakers need to constantly listen in order to activate when the “wake word” is spoken, and are known to transmit audio from their environment and record it on cloud servers. In particular, this paper focuses on the privacy risk from smart speaker misactivations , i.e. , when they activate, transmit, and/or record audio from their environment when the wake word is not spoken. To enable repeatable, scalable experiments for exposing smart speakers to conversations that do not contain wake words, we turn to playing audio from popular TV shows from diverse genres. After playing two rounds of 134 hours of content from 12 TV shows near popular smart speakers in both the US and in the UK, we observed cases of 0.95 misactivations per hour, or 1.43 times for every 10,000 words spoken, with some devices having 10% of their misactivation durations lasting at least 10 seconds. We characterize the sources of such misactivations and their implications for consumers, and discuss potential mitigations. 
    more » « less