skip to main content


Title: Are Anonymity-Seekers Just like Everybody Else? An Analysis of Contributions to Wikipedia from Tor
User-generated content sites routinely block contributions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Our analysis suggests that although Tor users who slip through Wikipedia's ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users.  more » « less
Award ID(s):
1931005 2031951
NSF-PAR ID:
10176334
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
IEEE Symposium on Security and Privacy (SP)
Volume:
1
Page Range / eLocation ID:
974–90
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. User-generated content sites routinely block contributions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Our analysis suggests that although Tor users who slip through Wikipedia's ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users. 
    more » « less
  2. Search engines are some of the most popular and profitable intelligent technologies in existence. Recent research, however, has suggested that search engines may be surprisingly dependent on user-created content like Wikipedia articles to address user information needs. In this paper, we perform a rigorous audit of the extent to which Google leverages Wikipedia and other user-generated content to respond to queries. Analyzing results for six types of important queries (e.g. most popular, trending, expensive advertising), we observe that Wikipedia appears in over 80% of results pages for some query types and is by far the most prevalent individual content source across all query types. More generally, our results provide empirical information to inform a nascent but rapidly-growing debate surrounding a highly consequential question: Do users provide enough value to intelligent technologies that they should receive more of the economic benefits from intelligent technologies? 
    more » « less
  3. Over 8 million users rely on the Tor network each day to protect their anonymity online. Unfortunately, Tor has been shown to be vulnerable to the website fingerprinting attack, which allows an attacker to deduce the website a user is visiting based on patterns in their traffic. The state-of-the-art attacks leverage deep learning to achieve high classification accuracy using raw packet information. Work thus far, however, has examined only one type of media delivered over the Tor network: web pages, and mostly just home pages of sites. In this work, we instead investigate the fingerprintability of video content served over Tor. We collected a large new dataset of network traces for 50 YouTube videos of similar length. Our preliminary experiments utilizing a convolutional neural network model proposed in prior works has yielded promising classification results, achieving up to 55% accuracy. This shows the potential to unmask the individual videos that users are viewing over Tor, creating further privacy challenges to consider when defending against website fingerprinting attacks. 
    more » « less
  4. The introduction of block-based programming has gradually changed the landscape of programming education, particularly for school children. Block languages today, however, have serious technical barriers to students with disabilities. For example, block languages are generally not screen reader accessible, incompatible with braille, and contain serious problems for users with motor impairments. No student with a disability should ever be denied access to learning computer science and they do not have to be. To help rectify this, we present a new approach to the design of block languages called Quorum Blocks. Quorum Blocks uses a custom hardware accelerated graphical rendering pipeline that takes into account how screen readers and other devices work under the hood. We discuss these technical details and demonstrate that accessibility support can be fully achieved without meaningfully losing either the look of modern blocks or their visual output. We present the results from focus groups that highlight the barriers students faced with a variety of disabilities when using the first version of Quorum Blocks. We focus especially on challenges with low vision users, screen reader users, or those using no mouse and only one hand to type. Block languages built using either our techniques, or on top of our libraries, would become accessible out of the box. 
    more » « less
  5. null (Ed.)
    Content moderation is a critical service performed by a variety of people on social media, protecting users from offensive or harmful content by reviewing and removing either the content or the perpetrator. These moderators fall into one of two categories: employees or volunteers. Prior research has suggested that there are differences in the effectiveness of these two types of moderators, with the more transparent user-based moderation being useful for educating users. However, direct comparisons between commercially-moderated and user-moderated platforms are rare, and apart from the difference in transparency, we still know little about what other disparities in user experience these two moderator types may create. To explore this, we conducted cross-platform surveys of over 900 users of commercially-moderated (Facebook, Instagram, Twitter, and YouTube) and user-moderated (Reddit and Twitch) social media platforms. Our results indicated that although user-moderated platforms did seem to be more transparent than commercially-moderated ones, this did not lead to user-moderated platforms being perceived as less toxic. In addition, commercially-moderated platform users want companies to take more responsibility for content moderation than they currently do, while user-moderated platform users want designated moderators and those who post on the site to take more responsibility. Across platforms, users seem to feel powerless and want to be taken care of when it comes to content moderation as opposed to engaging themselves. 
    more » « less