
Title: Towards Measuring Adversarial Twitter Interactions against Candidates in the US Midterm Elections
Adversarial interactions against politicians on social media such as Twitter have a significant impact on society. In particular, they disrupt substantive political discussions online and may discourage people from seeking public office. In this study, we measure the adversarial interactions against candidates for the US House of Representatives during the run-up to the 2018 US general election. We gather a new dataset consisting of 1.7 million tweets involving candidates, one of the largest corpora focusing on political discourse. We then develop a new technique for detecting tweets with toxic content that are directed at any specific candidate. This technique allows us to more accurately quantify adversarial interactions towards political candidates. Further, we introduce an algorithm to induce candidate-specific adversarial terms, capturing more nuanced adversarial interactions that previous techniques may not consider toxic. Finally, we use these techniques to outline the breadth of adversarial interactions seen in the election, including offensive name-calling, threats of violence, posting discrediting information, attacks on identity, and adversarial message repetition.
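The candidate-specific term-induction step can be pictured as a contrastive frequency analysis: terms that appear far more often in tweets directed at one candidate than in background discourse are candidates for the adversarial lexicon. The sketch below is a minimal illustration using smoothed log-odds over unigrams; it is not the paper's actual algorithm, and all function names and the smoothing scheme are assumptions.

```python
from collections import Counter
import math

def induce_adversarial_terms(candidate_tweets, background_tweets, top_k=10):
    """Rank unigrams that are disproportionately frequent in tweets aimed
    at one candidate, using add-one-smoothed log-odds against a background
    corpus. Illustrative sketch only, not the paper's induction algorithm."""
    fg = Counter(w for t in candidate_tweets for w in t.lower().split())
    bg = Counter(w for t in background_tweets for w in t.lower().split())
    n_fg, n_bg = sum(fg.values()), sum(bg.values())
    scores = {}
    for w, c in fg.items():
        p_fg = (c + 1) / (n_fg + len(fg))        # add-one smoothing
        p_bg = (bg[w] + 1) / (n_bg + len(bg))
        scores[w] = math.log(p_fg / p_bg)        # high => candidate-specific
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

A real pipeline would restrict the foreground corpus to tweets already flagged as toxic and directed at the candidate, so that the induced terms reflect adversarial rather than merely topical language.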
Journal Name:
Proceedings of the Fourteenth International AAAI Conference on Web and Social Media (ICWSM 2020)
Sponsoring Org:
National Science Foundation
More Like this
  1. The 2020 US Presidential election was historic in that it featured the first woman of color, Kamala Harris, on a major-party ticket. Although Harris identifies as Black, her racial identity was widely scrutinized throughout the election due to her mixed-race ancestry. Moreover, media coverage of Harris's racial identity appeared to vary based on that news outlet's political leaning and sometimes had prejudicial undertones. The current research investigated racial categorization of Harris and the role that political orientation and anti-Black prejudice might play in shaping these categorizations. Studies 1 and 2 tested the possibility that conservatives and liberals might mentally represent Harris differently, which we hypothesized would lead the two groups to differ in how they categorized her race. Contrary to our prediction, conservatives and liberals mentally represented Harris similarly. Also surprising were the explicit racial categorization data: conservatives labeled Harris as White more than liberals, who tended to categorize Harris as multiracial. This pattern was explained by anti-Black prejudice. Study 3 examined a potential political motivation that might explain this finding. We found that conservatives, more than liberals, judge having a non-White candidate on a Democratic ballot as an asset, which may lead conservatives to deny non-White candidates these identities.
  2. Enterprise software updates depend on the interaction between user and developer organizations. This interaction becomes especially complex when a single developer organization writes software that services hundreds of different user organizations. Miscommunication during patching and deployment efforts leads to insecure or malfunctioning software installations. While developers oversee the code, the update process starts and ends outside their control. Since developer test suites may fail to capture buggy behavior, finding and fixing these bugs starts with user-generated bug reports and third-party disclosures. The process ends when the fixed code is deployed in production. Any friction between user and developer delays the patching of critical bugs. Two common causes of friction are a failure to replicate the user-specific circumstances that cause buggy behavior, and incompatible software releases that break critical functionality. Existing test generation techniques are insufficient: they fail to test candidate patches for post-deployment bugs and to test whether a new release adversely affects customer workloads. With existing test generation and deployment techniques, users cannot choose (nor validate) compatible portions of new versions while retaining their previous version's functionality. We present two new technologies to alleviate this friction. First, Test Generation for Ad Hoc Circumstances transforms buggy executions into test cases. Second, Binary Patch Decomposition allows users to select the compatible pieces of update releases. By sharing specific context around buggy behavior, developers can create specific test cases that demonstrate whether their fixes are appropriate. When fixes are distributed with this extra context, users can incorporate only the updates that guarantee compatibility between buggy and fixed versions.
We use change analysis in combination with binary rewriting to transform the old executable and buggy execution into a test case that includes the developer's prospective changes, letting us generate and run targeted tests for the candidate patch. We also provide analogous support to users, who can selectively validate and patch their production environments with only the desired bug fixes from new version releases. This paper presents a new patching workflow that allows developers to validate prospective patches and users to select which updates they would like to apply, along with two new technologies that make it possible. We demonstrate that our technique constructs test cases more effectively and more efficiently than traditional test case generation on a collection of real-world bugs, and provides the ability for flexible updates in real-world scenarios.
  3. Predicting election outcomes is of considerable interest to candidates, political scientists, and the public at large. We propose the use of Web browsing history as a new indicator of candidate preference among the electorate, one that has potential to overcome a number of the drawbacks of election polls. However, there are a number of challenges that must be overcome to effectively use Web browsing for assessing candidate preference—including the lack of suitable ground truth data and the heterogeneity of user populations in time and space. We address these challenges, and show that the resulting methods can shed considerable light on the dynamics of voters’ candidate preferences in ways that are difficult to achieve using polls.
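One minimal way to turn browsing history into a candidate-preference signal is to average the political leaning of the labeled news domains a user visits. The sketch below is an illustrative assumption, not the authors' method: the domain labels are invented, and a real study would need validated leaning scores and corrections for population heterogeneity.

```python
# Hypothetical partisan-domain labels on a [-1, +1] leaning scale;
# a real study would require a validated, much larger list.
LEANING = {"leftnews.example": -1.0, "rightnews.example": 1.0}

def preference_score(visited_domains):
    """Average leaning of a user's visits to labeled news domains.
    The sign suggests candidate preference; the magnitude, its intensity.
    Returns 0.0 (no signal) when no visited domain is labeled."""
    labeled = [LEANING[d] for d in visited_domains if d in LEANING]
    return sum(labeled) / len(labeled) if labeled else 0.0
```

Aggregating such scores over users and time windows would yield the kind of preference dynamics the abstract contrasts with election polls.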
  4. When one searches for political candidates on Google, a panel composed of recent news stories, known as Top stories, is commonly shown at the top of the search results page. These stories are selected by an algorithm that chooses from hundreds of thousands of articles published by thousands of news publishers. In our previous work, we identified 56 news sources that contributed 2/3 of all Top stories for 30 political candidates running in the primaries of the 2020 US Presidential Election. In this paper, we survey US voters to elicit their familiarity with and trust in these 56 news outlets. We find that some of the most frequent outlets are not familiar to all voters (e.g., The Hill or Politico), or not particularly trusted by voters of any political stripe (e.g., Washington Examiner or The Daily Beast). Why, then, are such sources shown so frequently in Top stories? We theorize that Google is sampling news articles from sources with different political leanings to offer balanced coverage. This is reminiscent of the so-called "fairness doctrine" (1949-1987) policy in the United States that required broadcasters (radio or TV stations) to air contrasting views about controversial matters. Because there are fewer right-leaning publications than center- or left-leaning ones, in order to maintain this "fair" balance, hyper-partisan far-right news sources of low trust receive more visibility than some news sources that are more familiar to and trusted by the public.
  5. This paper focuses on a newly challenging setting in hard-label adversarial attacks on text data by taking budget information into account. Although existing approaches can successfully generate adversarial examples in the hard-label setting, they follow the idealized assumption that the victim model does not restrict the number of queries. However, in real-world applications, the query budget is usually tight or limited. Moreover, existing hard-label adversarial attack techniques use a genetic algorithm to optimize discrete text data by maintaining a number of adversarial candidates during optimization, which can lead to low-quality adversarial examples in the tight-budget setting. To solve this problem, we propose a new method named TextHoaxer that formulates the budgeted hard-label adversarial attack task on text data as a gradient-based optimization problem over a perturbation matrix in the continuous word embedding space. Compared with genetic-algorithm-based optimization, our solution uses only a single initialized adversarial example as the adversarial candidate for optimization, which significantly reduces the number of queries. The optimization is guided by a new objective function consisting of three terms: a semantic similarity term, a pair-wise perturbation constraint, and a sparsity constraint. The semantic similarity term and the pair-wise perturbation constraint ensure high semantic similarity of adversarial examples at both the overall text level and the individual word level, while the sparsity constraint explicitly restricts the number of perturbed words, which also helps enhance the quality of the generated text.
We conduct extensive experiments on eight text datasets against three representative natural language models, and experimental results show that TextHoaxer can generate high-quality adversarial examples with higher semantic similarity and a lower perturbation rate under the tight-budget setting.
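The three-term objective described above can be sketched as a single scalar function to maximize. The weights and the representation of the perturbation matrix (reduced here to per-word displacement norms in embedding space) are illustrative assumptions, not values from the paper.

```python
def texthoaxer_objective(perturbation, sim, lam1=1.0, lam2=0.1):
    """Illustrative combination of the three terms: maximize text-level
    semantic similarity `sim` while penalizing per-word perturbation
    magnitude (pair-wise constraint) and the count of perturbed words
    (sparsity constraint). `perturbation` is a list of per-word embedding
    displacement norms; lam1/lam2 are hypothetical trade-off weights."""
    pairwise = sum(p * p for p in perturbation)        # pair-wise perturbation penalty
    sparsity = sum(1 for p in perturbation if p > 0)   # number of perturbed words
    return sim - lam1 * pairwise - lam2 * sparsity
```

In a budgeted attack, each evaluation of such an objective costs victim-model queries, which is why optimizing a single candidate rather than a genetic population matters.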