Title: Characterizing Online Vandalism: A Rational Choice Perspective
What factors influence the decision to vandalize? Although the harm is clear, the benefit to the vandal is less clear. In many cases, the thing being damaged may itself be something the vandal uses or enjoys. Vandalism holds communicative value: perhaps to the vandal themselves, to some audience at whom the vandalism is aimed, and to the general public. Viewing vandals as rational community participants despite their antinormative behavior offers the possibility of engaging with or countering their choices in novel ways. Rational choice theory (RCT) as applied in value expectancy theory (VET) offers a strategy for characterizing behaviors in a framework of rational choices, and begins with the supposition that subject to some weighting of personal preferences and constraints, individuals maximize their own utility by committing acts of vandalism. This study applies the framework of RCT and VET to gain insight into vandals’ preferences and constraints. Using a mixed-methods analysis of Wikipedia, I combine social computing and criminological perspectives on vandalism to propose an ontology of vandalism for online content communities. I use this ontology to categorize 141 instances of vandalism and find that the character of vandalistic acts varies by vandals’ relative identifiability, policy history with Wikipedia, and the effort required to vandalize.
Award ID(s):
2031951 1703049
PAR ID:
10176369
Author(s) / Creator(s):
Date Published:
Journal Name:
International Conference on Social Media and Society
Page Range / eLocation ID:
47 to 57
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Food ontologies require significant effort to create and maintain as they involve manual and time-consuming tasks, often with limited alignment to the underlying food science knowledge. We propose a semi-supervised framework for the automated ontology population from an existing ontology scaffold by using word embeddings. Having applied this on the domain of food and subsequent evaluation against an expert-curated ontology, FoodOn, we observe that the food word embeddings capture the latent relationships and characteristics of foods. The resulting ontology, which utilizes word embeddings trained from the Wikipedia corpus, has an improvement of 89.7% in precision when compared to the expert-curated ontology FoodOn (0.34 vs. 0.18, respectively, p value = 2.6 × 10^-138), and it has a 43.6% shorter path distance (hops) between predicted and actual food instances (2.91 vs. 5.16, respectively, p value = 4.7 × 10^-84) when compared to other methods. This work demonstrates how high-dimensional representations of food can be used to populate ontologies and paves the way for learning ontologies that integrate contextual information from a variety of sources and types.
  2. Lau, Eric HY (Ed.)
    Randomized controlled trials (RCTs) evaluate hypotheses in specific contexts and are often considered the gold standard of evidence for infectious disease interventions, but their results cannot immediately generalize to other contexts (e.g., different populations, interventions, or disease burdens). Mechanistic models are one approach to generalizing findings between contexts, but infectious disease transmission models (IDTMs) are not immediately suited for analyzing RCTs, since they often rely on time-series surveillance data. We developed an IDTM framework to explain relative risk outcomes of an infectious disease RCT and applied it to a water, sanitation, and hygiene (WASH) RCT. This model can generalize the RCT results to other contexts and conditions. We developed this compartmental IDTM framework to account for key WASH RCT factors: i) transmission across multiple environmental pathways, ii) multiple interventions applied individually and in combination, iii) adherence to interventions or preexisting conditions, and iv) the impact of individuals not enrolled in the study. We employed a hybrid sampling and estimation framework to obtain posterior estimates of mechanistic parameter sets consistent with empirical outcomes. We illustrated our model using WASH Benefits Bangladesh RCT data (n = 17,187). Our model reproduced reported diarrheal prevalence in this RCT. The baseline estimate of the basic reproduction number R0 for the control arm (1.10, 95% CrI: 1.07, 1.16) corresponded to an endemic prevalence of 9.5% (95% CrI: 7.4, 13.7%) in the absence of interventions or preexisting WASH conditions. No single pathway was likely able to sustain transmission: pathway-specific R0 values for water, fomites, and all other pathways were 0.42 (95% CrI: 0.03, 0.97), 0.20 (95% CrI: 0.02, 0.59), and 0.48 (95% CrI: 0.02, 0.94), respectively.
    An IDTM approach to evaluating RCTs can complement RCT analysis by providing a rigorous framework for generating data-driven hypotheses that explain trial findings, particularly unexpected null results, opening up existing data to deeper epidemiological understanding.
  3. User-generated content sites routinely block contributions from users of privacy-enhancing proxies like Tor because of a perception that proxies are a source of vandalism, spam, and abuse. Although these blocks might be effective, collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible. One of the largest and most important user-generated content sites, Wikipedia, has attempted to block contributions from Tor users since as early as 2005. We demonstrate that these blocks have been imperfect and that thousands of attempts to edit on Wikipedia through Tor have been successful. We draw upon several data sources and analytical techniques to measure and describe the history of Tor editing on Wikipedia over time and to compare contributions from Tor users to those from other groups of Wikipedia users. Our analysis suggests that although Tor users who slip through Wikipedia's ban contribute content that is more likely to be reverted and to revert others, their contributions are otherwise similar in quality to those from other unregistered participants and to the initial contributions of registered users. 
  4. With the growing industry applications of Artificial Intelligence (AI) systems, pre-trained models and APIs have emerged and greatly lowered the barrier of building AI-powered products. However, novice AI application designers often struggle to recognize the inherent algorithmic trade-offs and evaluate model fairness before making informed design decisions. In this study, we examined the Objective Revision Evaluation System (ORES), a machine learning (ML) API in Wikipedia used by the community to build anti-vandalism tools. We designed an interactive visualization system to communicate model threshold trade-offs and fairness in ORES. We evaluated our system by conducting 10 in-depth interviews with potential ORES application designers. We found that our system helped application designers who have limited ML backgrounds learn about in-context ML knowledge, recognize inherent value trade-offs, and make design decisions that aligned with their goals. By demonstrating our system in a real-world domain, this paper presents a novel visualization approach to facilitate greater accessibility and human agency in AI application design.