Title: PrivacyOracle: Configuring Sensor Privacy Firewalls with Large Language Models in Smart Built Environments
Modern smart buildings and environments rely on sensory infrastructure to capture and process information about their inhabitants. However, it remains challenging to ensure that this infrastructure complies with privacy norms, preferences, and regulations; individuals occupying smart environments are often occupied with their tasks, lack awareness of the surrounding sensing mechanisms, and are not technical experts. This problem is only exacerbated by the increasing number of sensors being deployed in these environments, as well as by the services seeking to use their sensory data. As a result, individuals face an unmanageable number of privacy decisions, preventing them from effectively behaving as their own “privacy firewall” for filtering and managing the multitude of personal information flows. These decisions often require qualitative reasoning over privacy regulations, understanding privacy-sensitive contexts, and applying various privacy transformations when necessary. We propose the use of Large Language Models (LLMs), which have demonstrated qualitative reasoning over social/legal norms, sensory data, and program synthesis, all of which are necessary for privacy firewalls. We present PrivacyOracle, a prototype system that uses LLMs to configure privacy firewalls on behalf of users, enabling automated privacy decisions in smart built environments. Our evaluation shows that PrivacyOracle achieves up to …
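The abstract describes LLMs synthesizing privacy-firewall configurations that decide, per information flow, whether to allow, block, or transform sensor data. A minimal sketch of what such a synthesized configuration could look like; the rule table, field names, and decision function are illustrative assumptions, not the paper's actual interface (PrivacyOracle reasons over norms and regulations with an LLM rather than a fixed lookup):

```python
from dataclasses import dataclass

@dataclass
class FlowRequest:
    sensor: str      # e.g. "camera"
    context: str     # e.g. "bedroom"
    recipient: str   # e.g. "fall-detection service"

# Hypothetical rule table that an LLM might synthesize from privacy
# norms; values are either "allow", "block", or a named transformation.
SYNTHESIZED_RULES = {
    ("camera", "bedroom"): "block",
    ("camera", "kitchen"): "blur_faces",
    ("microphone", "kitchen"): "allow",
}

def firewall_decision(req: FlowRequest) -> str:
    """Return 'allow', 'block', or a transformation name for a flow.

    Unknown (sensor, context) pairs default to 'block', a conservative
    choice assumed here for illustration.
    """
    return SYNTHESIZED_RULES.get((req.sensor, req.context), "block")
```

A deployment could re-invoke the LLM to regenerate this table whenever new sensors or services appear, keeping the per-decision burden off the occupant.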
Award ID(s):
1705135
PAR ID:
10540443
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
ISSN:
2770-8411
ISBN:
979-8-3503-5487-4
Page Range / eLocation ID:
239 to 245
Format(s):
Medium: X
Location:
San Francisco, CA, USA
Sponsoring Org:
National Science Foundation
More Like This
  1. Smart home cameras raise privacy concerns in part because they frequently collect data not only about the primary users who deployed them but also other parties -- who may be targets of intentional surveillance or incidental bystanders. Domestic employees working in smart homes must navigate a complex situation that blends privacy and social norms for homes, workplaces, and caregiving. This paper presents findings from 25 semi-structured interviews with domestic childcare workers in the U.S. about smart home cameras, focusing on how privacy considerations interact with the dynamics of their employer-employee relationships. We show how participants’ views on camera data collection, and their desire and ability to set conditions on data use and sharing, were affected by power differentials and norms about who should control information flows in a given context. Participants’ attitudes about employers’ cameras often hinged on how employers used the data; whether participants viewed camera use as likely to reinforce negative tendencies in the employer-employee relationship; and how camera use and disclosure might reflect existing relationship tendencies. We also suggest technical and social interventions to mitigate the adverse effects of power imbalances on domestic employees’ privacy and individual agency. 
  2. Web forms are one of the primary ways to collect personal information online, yet they are relatively under-studied. Unlike web tracking, data collection through web forms is explicit and contextualized. Users (i) are asked to input specific personal information types, and (ii) know the specific context (i.e., on which website and for what purpose). For web forms to be trusted by users, they must meet the common sense standards of appropriate data collection practices within a particular context (i.e., privacy norms). In this paper, we extract the privacy norms embedded within web forms through a measurement study. First, we build a specialized crawler to discover web forms on websites. We run it on 11,500 popular websites, and we create a dataset of 293K web forms. Second, to process data of this scale, we develop a cost-efficient way to annotate web forms with form types and personal information types, using text classifiers trained with assistance of large language models (LLMs). Third, by analyzing the annotated dataset, we reveal common patterns of data collection practices. We find that (i) these patterns are explained by functional necessities and legal obligations, thus reflecting privacy norms, and that (ii) deviations from the observed norms often signal unnecessary data collection. In addition, we analyze the privacy policies that accompany web forms. We show that, despite their wide adoption and use, there is a disconnect between privacy policy disclosures and the observed privacy norms. 
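The study above annotates web forms with personal-information types using text classifiers trained with LLM assistance. A minimal sketch of the annotation step, with a keyword heuristic standing in for the trained classifier; the keyword table and type names are illustrative assumptions, not the paper's taxonomy:

```python
# Hypothetical mapping from field-label keywords to personal-information
# types; a real pipeline would use a trained text classifier instead.
PII_KEYWORDS = {
    "email": "EmailAddress",
    "phone": "PhoneNumber",
    "address": "PostalAddress",
    "card": "PaymentCard",
}

def annotate_field(label: str) -> str:
    """Map a form-field label to a personal-information type.

    Returns "Other" when no keyword matches, so every field in a
    crawled form receives some annotation.
    """
    label = label.lower()
    for keyword, pii_type in PII_KEYWORDS.items():
        if keyword in label:
            return pii_type
    return "Other"
```

Aggregating such annotations over many forms of the same type (e.g. checkout vs. newsletter signup) is what lets the study surface which data collection is typical for a context and which deviates from the norm.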
  3. Existing efforts on quantifying privacy implications for large language models (LLMs) solely focus on measuring leakage of training data. In this work, we shed light on the often-overlooked interactive settings where an LLM receives information from multiple sources and generates an output to be shared with other entities, creating the potential of exposing sensitive input data in inappropriate contexts. In these scenarios, humans naturally uphold privacy by choosing whether or not to disclose information depending on the context. We ask the question “Can LLMs demonstrate an equivalent discernment and reasoning capability when considering privacy in context?” We propose CONFAIDE, a benchmark grounded in the theory of contextual integrity and designed to identify critical weaknesses in the privacy reasoning capabilities of instruction-tuned LLMs. CONFAIDE consists of four tiers, gradually increasing in complexity, with the final tier evaluating contextual privacy reasoning and theory-of-mind capabilities. Our experiments show that even commercial models such as GPT-4 and ChatGPT reveal private information in contexts that humans would not, 39% and 57% of the time, respectively, highlighting the urgent need for a new direction of privacy-preserving approaches, as we demonstrate a larger underlying problem stemming from the models’ lack of reasoning capabilities. 
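The headline numbers above (39% and 57%) measure how often a model discloses information in contexts where human annotators would withhold it. A minimal sketch of one plausible way to compute such a rate; the metric definition is an assumption for illustration, not CONFAIDE's exact scoring procedure:

```python
def leakage_rate(model_reveals: list[bool], human_reveals: list[bool]) -> float:
    """Fraction of human-withheld scenarios where the model disclosed.

    model_reveals[i] / human_reveals[i]: whether the model / human
    judges chose to disclose the sensitive information in scenario i.
    """
    # Scenarios where humans would NOT share the information.
    withheld_by_humans = sum(not h for h in human_reveals)
    # Of those, scenarios where the model shared it anyway.
    leaks = sum(m and not h for m, h in zip(model_reveals, human_reveals))
    return leaks / withheld_by_humans if withheld_by_humans else 0.0
```

Under this definition, a model that only discloses where humans also would scores 0.0, and one that leaks in every human-withheld scenario scores 1.0.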
  4. With the increase in the number of privacy regulations, small development teams are forced to make privacy decisions on their own. In this paper, we conduct a mixed-method survey study, including statistical and qualitative analysis, to evaluate the privacy perceptions, practices, and knowledge of members involved in various phases of the Software Development Life Cycle (SDLC). Our survey includes 362 participants from 23 countries, encompassing roles such as product managers, developers, and testers. Our results show diverse definitions of privacy across SDLC roles, emphasizing the need for a holistic privacy approach throughout SDLC. We find that software teams, regardless of their region, are less familiar with privacy concepts (such as anonymization), relying on self-teaching and forums. Most participants are more familiar with GDPR and HIPAA than other regulations, with multijurisdictional compliance being their primary concern. Our results advocate the need for role-dependent solutions to address the privacy challenges, and we highlight research directions and educational takeaways to help improve privacy-aware SDLC. 
  5.
    Cameras are everywhere, and are increasingly coupled with video analytics software that can identify our face, track our mood, recognize what we are doing, and more. We present the results of a 10-day in-situ study designed to understand how people feel about these capabilities, looking both at the extent to which they expect to encounter them as part of their everyday activities and at how comfortable they are with the presence of such technologies across a range of realistic scenarios. Results indicate that while some widespread deployments are expected by many (e.g., surveillance in public spaces), others are not, with some making people feel particularly uncomfortable. Our results further show that individuals’ privacy preferences and expectations are complicated and vary with a number of factors such as the purpose for which footage is captured and analyzed, the particular venue where it is captured, and whom it is shared with. Finally, we discuss the implications of people’s rich and diverse preferences on opt-in or opt-out rights for the collection and use (including sharing) of data associated with these video analytics scenarios as mandated by regulations. Because of the user burden associated with the large number of privacy decisions people could be faced with, we discuss how new types of privacy assistants could possibly be configured to help people manage these decisions. 