Search for: All records

Award ID contains: 1955227

Note: Clicking a Digital Object Identifier (DOI) link will take you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. EU data localization regulations, such as the GDPR, limit data transfers to non-EU countries. However, BGP, DNS, and other Internet protocols were not designed to enforce jurisdictional constraints, so implementing data localization is challenging. Despite initial research on the topic, little is known about whether or how companies currently operate their server infrastructure to comply with these regulations. We close this knowledge gap by empirically measuring the extent to which servers and routers that process EU requests are located outside of the EU (and a handful of 'adequate' non-EU countries). The key challenge is that both browser measurements (to infer relevant endpoints) and data-plane measurements (to infer relevant IP addresses) are needed, but no large-scale public infrastructure allows both. We build a novel methodology that combines BrightData (browser) and RIPE Atlas (data-plane) probes, with joint measurements from over 1,000 networks in 20 EU countries. We find that, on average, 2.2% of servers serving users in each EU country are located in non-adequate destination countries (1.4% of known trackers). Our findings suggest that data localization policies are largely being followed by content providers, though there are exceptions. (A minimal sketch of the adequacy check appears after this list.)
    Free, publicly-accessible full text available July 1, 2026
  2. Free, publicly-accessible full text available April 25, 2026
  3. Many companies, including Google, Amazon, and Apple, offer voice assistants as a convenient solution for answering general voice queries and accessing their services. These voice assistants have gained popularity and can be easily accessed through various smart devices such as smartphones, smart speakers, smartwatches, and an increasing array of other devices. However, this convenience comes with potential privacy risks. For instance, while companies vaguely mention in their privacy policies that they may use voice interactions for user profiling, it remains unclear to what extent this profiling occurs and whether voice interactions pose greater privacy risks than other interaction modalities. In this paper, we conduct 1,171 experiments involving 24,530 queries with different personas and interaction modalities over 20 months to characterize how the three most popular voice assistants profile their users. We analyze factors such as the labels assigned to users, their accuracy, the time taken to assign these labels, differences between voice and web interactions, and the effectiveness of the profiling remediation tools offered by each voice assistant. Our findings reveal that profiling can happen without interaction, can be incorrect and inconsistent at times, may take several days or weeks to change, and is affected by the interaction modality.
    Free, publicly-accessible full text available April 1, 2026
  4. In recent years, gig work platforms have gained popularity as a way for individuals to earn money; as of 2021, 16% of Americans had at some point earned money from such platforms. Despite their popularity, and despite a history of unfair data collection practices and worker safety issues, little is known about the data these platforms collect from workers (and users) or about the privacy dark patterns present in their apps. This paper presents an empirical measurement of the data practices of 16 gig work platforms in the U.S. We analyze what data these platforms collect, and how it is shared and used. Finally, we consider how these practices constitute privacy dark patterns. To that end, we develop a novel combination of methods to address gig-worker-specific challenges in experimentation and data collection, enabling the largest in-depth study of such platforms to date. We find extensive data collection and sharing with 60 third parties, including the sharing of reversible hashes of worker Social Security Numbers (SSNs), along with dark patterns that subject workers to greater privacy risk and opportunistically use collected data to nag workers in off-platform messages. We conclude this paper with proposed interdisciplinary mitigations for improving gig worker privacy protections. After we disclosed our SSN-related findings to the affected platforms, they confirmed that the issue had been mitigated, which is consistent with our independent audit. Analysis code and redacted datasets will be made available to those who wish to reproduce our findings. (A sketch after this list illustrates why an unsalted hash of an SSN is reversible.)
    Free, publicly-accessible full text available January 1, 2026
  5. Detailed targeting of advertisements has long been one of the core offerings of online platforms. Unfortunately, malicious advertisers have frequently abused such targeting features, with results that range from violating civil rights laws to driving division, polarization, and even social unrest. Platforms have often attempted to mitigate this behavior by removing targeting attributes deemed problematic, such as inferred political leaning, religion, or ethnicity. In this work, we examine the effectiveness of these mitigations by collecting data from political ads placed on Facebook in the lead-up to the 2022 U.S. midterm elections. We show that major political advertisers circumvented these mitigations by targeting proxy attributes: seemingly innocuous targeting criteria that closely correspond to political and racial divides in American society. We introduce novel methods for directly measuring the skew of various targeting criteria to quantify their effectiveness as proxies, and then examine the scale at which those attributes are used. Our findings have crucial implications for the ongoing discussion on the regulation of political advertising and emphasize the urgent need for increased transparency.
    Free, publicly-accessible full text available November 8, 2025
  6. Free, publicly-accessible full text available November 4, 2025
  7. As remote work and learning increase in popularity, individuals, especially those with hearing impairments or who speak English as a second language, may depend on automated transcriptions to participate in business, school, entertainment, or basic communication. In this work, we investigate the automated transcription accuracy of seven popular social media and videoconferencing platforms with respect to personal characteristics of their users, including gender, age, race, first language, speech rate, fundamental frequency (F0), and speech readability. We perform this investigation on a new corpus of 194 hours of English monologues by 846 TED talk speakers. Our results show the presence of significant bias, with transcripts less accurate for speakers who are male or non-native English speakers. We also observe differences in accuracy among platforms for different types of speakers. These results indicate that, while platforms have improved their automatic captioning, much work remains to make captions accessible to a wider variety of speakers and listeners. (A word-error-rate sketch after this list shows one standard way such accuracy is measured.)
  8. With an increasing number of Internet of Things (IoT) devices present in homes, there is a rise in the number of potential information leakage channels and their associated security threats and privacy risks. Despite a long history of attacks on IoT devices in unprotected home networks, the problem of accurate, rapid detection and prevention of such attacks remains open. Many existing IoT protection solutions are cloud-based, sometimes ineffective, and might share consumer data with unknown third parties. This paper investigates the potential for effective IoT threat detection locally, on a home router, using AI tools combined with classic rule-based traffic-filtering algorithms. Our results show that, with only a slight increase in router hardware resource usage from the machine learning and traffic-filtering logic, a typical home router instrumented with our solution can effectively detect risks and protect a typical home IoT network, equaling or outperforming existing popular solutions, without any effect on benign IoT functionality and without relying on cloud services or third parties. (A toy sketch of this hybrid rule-plus-model check appears after this list.)
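For the data-localization measurement in item 1, the core compliance check can be illustrated as follows: geolocate each server IP observed from an EU vantage point and flag those located outside the EU and the adequacy list. This is a minimal sketch under stated assumptions, not the paper's methodology; the toy geolocation table is a hypothetical stand-in for a real GeoIP database, and the adequacy set shown is partial.

```python
# Illustrative only (not the paper's code): flag server IPs whose geolocated
# country is outside the EU plus the adequacy list. TOY_GEOIP stands in for a
# real GeoIP database; the adequacy set below is partial.

EU_COUNTRIES = {
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR",
    "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK",
    "SI", "ES", "SE",
}
ADEQUATE_NON_EU = {"CH", "GB", "JP", "NZ", "KR", "IL", "CA", "AR", "UY"}  # partial

TOY_GEOIP = {"192.0.2.10": "DE", "198.51.100.7": "IN", "203.0.113.5": "JP"}

def geolocate_country(ip):
    """Hypothetical lookup; a real study would query a GeoIP database."""
    return TOY_GEOIP.get(ip, "??")

def non_adequate_share(server_ips):
    """Fraction of observed server IPs located outside the EU + adequacy set."""
    allowed = EU_COUNTRIES | ADEQUATE_NON_EU
    flagged = [ip for ip in server_ips if geolocate_country(ip) not in allowed]
    return len(flagged) / len(server_ips) if server_ips else 0.0

print(non_adequate_share(list(TOY_GEOIP)))  # 1 of 3 toy IPs is flagged
```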
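Item 4 reports platforms sharing reversible hashes of worker SSNs. The abstract does not state the hashing scheme; assuming an unsalted SHA-256 over the nine-digit string purely for illustration, the sketch below shows why such a hash offers little protection: the entire input space of roughly one billion values can simply be enumerated until the hash matches.

```python
import hashlib

def ssn_hash(ssn):
    # Assumed scheme for illustration: unsalted SHA-256 over the 9-digit string.
    return hashlib.sha256(ssn.encode()).hexdigest()

def reverse_ssn_hash(target, search_space=range(1_000_000_000)):
    # Brute force: only ~1e9 candidate SSNs exist, so full enumeration is
    # feasible on commodity hardware. A tiny range keeps this demo instant.
    for n in search_space:
        candidate = f"{n:09d}"
        if ssn_hash(candidate) == target:
            return candidate
    return None

leaked = ssn_hash("000001234")
print(reverse_ssn_hash(leaked, range(10_000)))  # -> 000001234
```

A keyed construction (e.g., an HMAC with a secret key) or not transmitting SSN-derived values at all would remove this trivial reversal path.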
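Item 7 compares transcription accuracy across speaker characteristics. The abstract does not name its metric; word error rate (WER), the standard measure for automatic transcription, is assumed here purely for illustration. The sketch computes WER as a word-level edit distance normalized by reference length.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with a standard word-level Levenshtein dynamic program."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # match/substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(word_error_rate("the quick brown fox", "the quick brown box"))  # 0.25
```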
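Item 8 describes combining classic rule-based traffic filtering with machine learning on a home router. The sketch below is a toy illustration of such a hybrid check; the flow features, port deny-list, thresholds, and placeholder anomaly scorer are assumptions for illustration, not the paper's actual rules or model.

```python
# Toy hybrid IoT traffic check: static rules first, then a model-based score.
# Feature names and thresholds are illustrative assumptions, not the paper's.

BLOCKED_PORTS = {23, 2323}          # e.g., Telnet ports abused by IoT botnets
MAX_BYTES_PER_MINUTE = 5_000_000    # illustrative per-device upload ceiling

def rule_verdict(flow):
    if flow["dst_port"] in BLOCKED_PORTS:
        return "block: rule (suspicious port)"
    if flow["bytes_per_minute"] > MAX_BYTES_PER_MINUTE:
        return "block: rule (upload volume)"
    return None

def anomaly_score(flow):
    # Placeholder for a trained model (e.g., a lightweight classifier or
    # anomaly detector run on the router); returns a score in [0, 1].
    return 0.0

def classify_flow(flow):
    verdict = rule_verdict(flow)
    if verdict:
        return verdict
    if anomaly_score(flow) > 0.8:
        return "block: model (anomalous flow)"
    return "allow"

print(classify_flow({"dst_port": 23, "bytes_per_minute": 1_000}))  # block: rule
```

The intent of the ordering is that cheap static rules decide the obvious cases and the model only scores the remainder, which keeps overhead low on constrained router hardware.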