Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
To address privacy concerns with the Internet of Things (IoT) devices, researchers have proposed enhancements in data collection transparency and user control. However, managing privacy preferences for shared devices with multiple stakeholders remains challenging. We introduced ThingPoll, a system that helps users negotiate privacy configurations for IoT devices in shared settings. We designed ThingPoll by observing twelve participants verbally negotiating privacy preferences, from which we identified potentially successful and inefficient negotiation patterns. ThingPoll bootstraps a preference model from a custom crowdsourced privacy preferences dataset. During negotiations, ThingPoll strategically scaffolds the process by eliciting users’ privacy preferences, providing helpful contexts, and suggesting feasible configuration options. We evaluated ThingPoll with 30 participants negotiating the privacy settings of 4 devices. Using ThingPoll, participants reached an agreement in 97.5% of scenarios within an average of 3.27 minutes. Participants reported high overall satisfaction of 83.3% with ThingPoll as compared to baseline approaches.more » « lessFree, publicly-accessible full text available May 11, 2025
-
The U.S. Government is developing a package label to help consumers access reliable security and privacy information about Internet of Things (IoT) devices when making purchase decisions. The label will include the U.S. Cyber Trust Mark, a QR code to scan for more details, and potentially additional information. To examine how label information complexity and educational interventions affect comprehension of security and privacy attributes and label QR code use, we conducted an online survey with 518 IoT purchasers. We examined participants’ comprehension and preferences for three labels of varying complexities, with and without an educational intervention. Participants favored and correctly utilized the two higher-complexity labels, showing a special interest in the privacy-relevant content. Furthermore, while the educational intervention improved understanding of the QR code’s purpose, it had a modest effect on QR scanning behavior. We highlight clear design and policy directions for creating and deploying IoT security and privacy labels.more » « lessFree, publicly-accessible full text available May 11, 2025
-
Audio-based human activity recognition (HAR) is very popular because many human activities have unique sound signatures that can be detected using machine learning (ML) approaches. These audio-based ML HAR pipelines often use common featurization techniques, such as extracting various statistical and spectral features by converting time domain signals to the frequency domain (using an FFT) and using them to train ML models. Some of these approaches also claim privacy benefits by preventing the identification of human speech. However, recent deep learning-based automatic speech recognition (ASR) models pose new privacy challenges to these featurization techniques. In this paper, we systematically evaluate various featurization approaches for audio data, assessing their privacy risks through metrics like speech intelligibility (PER and WER) while considering the utility tradeoff in terms of ML-based activity recognition accuracy. Our findings reveal the susceptibility of these approaches to speech content recovery when exposed to recent ASR models, especially under re-tuning or retraining conditions. Notably, fine-tuned ASR models achieved an average Phoneme Error Rate (PER) of 39.99% and Word Error Rate (WER) of 44.43% in speech recognition for these approaches. To overcome these privacy concerns, we propose Kirigami, a lightweight machine learning-based audio speech filter that removes human speech segments reducing the efficacy of ASR models (70.48% PER and 101.40% WER) while also maintaining HAR accuracy (76.0% accuracy). We show that Kirigami can be implemented on common edge microcontrollers with limited computational capabilities and memory, providing a path to deployment on a variety of IoT devices. Finally, we conducted a real-world user study and showed the robustness of Kirigami on a laptop and an ARM Cortex-M4F microcontroller under three different background noises.more » « lessFree, publicly-accessible full text available March 6, 2025
-
The use of audio and video modalities for Human Activity Recognition (HAR) is common, given the richness of the data and the availability of pre-trained ML models using a large corpus of labeled training data. However, audio and video sensors also lead to significant consumer privacy concerns. Researchers have thus explored alternate modalities that are less privacy-invasive such as mmWave doppler radars, IMUs, motion sensors. However, the key limitation of these approaches is that most of them do not readily generalize across environments and require significant in-situ training data. Recent work has proposed cross-modality transfer learning approaches to alleviate the lack of trained labeled data with some success. In this paper, we generalize this concept to create a novel system called VAX (Video/Audio to 'X'), where training labels acquired from existing Video/Audio ML models are used to train ML models for a wide range of 'X' privacy-sensitive sensors. Notably, in VAX, once the ML models for the privacy-sensitive sensors are trained, with little to no user involvement, the Audio/Video sensors can be removed altogether to protect the user's privacy better. We built and deployed VAX in ten participants' homes while they performed 17 common activities of daily living. Our evaluation results show that after training, VAX can use its onboard camera and microphone to detect approximately 15 out of 17 activities with an average accuracy of 90%. For these activities that can be detected using a camera and a microphone, VAX trains a per-home model for the privacy-preserving sensors. These models (average accuracy = 84%) require no in-situ user input. In addition, when VAX is augmented with just one labeled instance for the activities not detected by the VAX A/V pipeline (~2 out of 17), it can detect all 17 activities with an average accuracy of 84%. Our results show that VAX is significantly better than a baseline supervised-learning approach of using one labeled instance per activity in each home (average accuracy of 79%) since VAX reduces the user burden of providing activity labels by 8x (~2 labels vs. 17 labels).
-
Translating fine-grained activity detection (e.g., phone ring, talking interspersed with silence and walking) into semantically meaningful and richer contextual information (e.g., on a phone call for 20 minutes while exercising) is essential towards enabling a range of healthcare and human-computer interaction applications. Prior work has proposed building ontologies or temporal analysis of activity patterns with limited success in capturing complex real-world context patterns. We present TAO, a hybrid system that leverages OWL-based ontologies and temporal clustering approaches to detect high-level contexts from human activities. TAO can characterize sequential activities that happen one after the other and activities that are interleaved or occur in parallel to detect a richer set of contexts more accurately than prior work. We evaluate TAO on real-world activity datasets (Casas and Extrasensory) and show that our system achieves, on average, 87% and 80% accuracy for context detection, respectively. We deploy and evaluate TAO in a real-world setting with eight participants using our system for three hours each, demonstrating TAO's ability to capture semantically meaningful contexts in the real world. Finally, to showcase the usefulness of contexts, we prototype wellness applications that assess productivity and stress and show that the wellness metrics calculated using contexts provided by TAO are much closer to the ground truth (on average within 1.1%), as compared to the baseline approach (on average within 30%).
-
Third-party dependencies expose websites to shared risks and cascading failures. The dependencies impact African websites as well e.g., Afrihost outage in 2022 [15]. While the prevalence of third-party dependencies has been studied for globally popular websites, Africa is largely underrepresented in those studies. Hence, this work analyzes the prevalence of third-party infrastructure dependencies in Africa-centric websites from 4 African vantage points. We consider websites that fall into one of the four categories: Africa-visited (popular in Africa) Africa-hosted (sites hosted in Africa), Africa-dominant (sites targeted towards users in Africa), and Africa-operated (websites operated in Africa). Our key findings are: 1) 93% of the Africa-visited websites critically depend on a third-party DNS, CDN, or CA. In perspective, US-visited websites are up to 25% less critically dependent. 2) 97% of Africa-dominant, 96% of Africa-hosted, and 95% of Africa-operated websites are critically dependent on a third-party DNS, CDN, or CA provider. 3) The use of third-party services is concentrated where only 3 providers can affect 60% of the Africa-centric websites. Our findings have key implications for the present usage and recommendations for the future evolution of the Internet in Africa.more » « less
-
There is increasing interest in deploying building-scale, general-purpose, and high-fidelity sensing to drive emerging smart building applications. However, the real-world deployment of such systems is challenging due to the lack of system and architectural support. Most existing sensing systems are purpose-built, consisting of hardware that senses a limited set of environmental facets, typically at low fidelity and for short-term deployment. Furthermore, prior systems with high-fidelity sensing and machine learning fail to scale effectively and have fewer primitives, if any, for privacy and security. For these reasons, IoT deployments in buildings are generally short-lived or done as a proof of concept. We present the design of Mites, a scalable end-to-end hardware-software system for supporting and managing distributed general-purpose sensors in buildings. Our design includes robust primitives for privacy and security, essential features for scalable data management, as well as machine learning to support diverse applications in buildings. We deployed our Mites system and 314 Mites devices in Tata Consultancy Services (TCS) Hall at Carnegie Mellon University (CMU), a fully occupied, five-story university building. We present a set of comprehensive evaluations of our system using a series of microbenchmarks and end-to-end evaluations to show how we achieved our stated design goals. We include five proof-of-concept applications to demonstrate the extensibility of the Mites system to support compelling IoT applications. Finally, we discuss the real-world challenges we faced and the lessons we learned over the five-year journey of our stack's iterative design, development, and deployment.
-
As Internet-of-Things (IoT) devices rapidly gain popularity, they raise significant privacy concerns given the breadth of sensitive data they can capture. These concerns are amplified by the fact that in many situations, IoT devices collect data about people other than their owner or administrator, and these stakeholders have no say in how that data is managed, used, or shared. To address this, we propose a new model of ownership, IoT Ephemeral Ownership (TEO). TEO allows stakeholders to quickly register with an IoT device for a limited period, and thus claim co-ownership over the sensitive data that the device generates. Device admins retain the ability to decide who may become an ephemeral owner, but no longer have access or control to the private data generated by the device. The encrypted data in TEO is accessible only by entities after seeking explicit permission from the different co-owners of that data. We verify the key security properties of our protocol underpinning TEO in the symbolic model using ProVerif. We also implement a cross-platform prototype of TEO for mobile phones and embedded devices, and integrate it into three real-world application case studies. Our evaluation shows that the latency and battery impact of TEO is typically small, adding ≤ 187 ms onto one-time operations, and introducing limited (<25%) overhead on recurring operations like private data storage.more » « less
-
Apple announced the introduction of app privacy details to their App Store in December 2020, marking the frst ever real-world, large-scale deployment of the privacy nutrition label concept, which had been introduced by researchers over a decade earlier. The Apple labels are created by app developers, who self-report their app’s data practices. In this paper, we present the frst study examining the usability and understandability of Apple’s privacy nutrition label creation process from the developer’s perspective. By observing and interviewing 12 iOS app developers about how they created the privacy label for a real-world app that they developed, we identified common challenges for correctly and efciently creating privacy labels. We discuss design implications both for improving Apple’s privacy label design and for future deployment of other standardized privacy notices.more » « less