NSF PAR Search | NSF Public Access Repository

Disinformation 2.0 in the Age of AI: A Cybersecurity Perspective

https://doi.org/10.1145/3624721

Mazurczyk, Wojciech; Lee, Dongwon; Vlachos, Andreas (March 2024, Communications of the ACM)

Why disinformation is a cyber threat.
Full Text Available
ALISON: Fast and Effective Stylometric Authorship Obfuscation

https://doi.org/10.1609/aaai.v38i17.29901

Xing, Eric; Venkatraman, Saranya; Le, Thai; Lee, Dongwon (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Authorship Attribution (AA) and Authorship Obfuscation (AO) are two competing tasks of increasing importance in privacy research. Modern AA leverages an author's consistent writing style to match a text to its author using an AA classifier. AO is the corresponding adversarial task, aiming to modify a text in such a way that its semantics are preserved, yet an AA model cannot correctly infer its authorship. To address privacy concerns raised by state-of-the-art (SOTA) AA methods, new AO methods have been proposed but remain largely impractical to use due to their prohibitively slow training and obfuscation speed, often taking hours. To address this challenge, we propose a practical AO method, ALISON, that (1) dramatically reduces training/obfuscation time, demonstrating more than 10x faster obfuscation than SOTA AO methods, (2) achieves better obfuscation success through attacking three transformer-based AA methods on two benchmark datasets, typically performing 15% better than competing methods, (3) does not require direct signals from a target AA classifier during obfuscation, and (4) utilizes unique stylometric features, allowing sound model interpretation for explainable obfuscation. We also demonstrate that ALISON can effectively prevent four SOTA AA methods from accurately determining the authorship of ChatGPT-generated texts, all while minimally changing the original text semantics. To ensure the reproducibility of our findings, our code and data are available at: https://github.com/EricX003/ALISON.
Full Text Available
Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting Elusive Disinformation

https://doi.org/10.18653/v1/2023.emnlp-main.883

Lucas, Jason; Uchendu, Adaku; Yamashita, Michiharu; Lee, Jooyoung; Rohatgi, Shaurya; Lee, Dongwon (December 2023, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing)

Full Text Available
A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts

https://doi.org/10.18653/v1/2024.acl-long.357

Tripto, Nafis Irtiza; Venkatraman, Saranya; Macko, Dominik; Moro, Robert; Srba, Ivan; Uchendu, Adaku; Le, Thai; Lee, Dongwon (January 2024, Association for Computational Linguistics)

Full Text Available
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective

https://doi.org/10.1145/3606274.3606276

Uchendu, Adaku; Le, Thai; Lee, Dongwon (July 2023, ACM SIGKDD Explorations Newsletter)

Two interlocking research questions of growing interest and importance in privacy research are Authorship Attribution (AA) and Authorship Obfuscation (AO). Given an artifact, especially a text t in question, an AA solution aims to accurately attribute t to its true author out of many candidate authors, while an AO solution aims to modify t to hide its true authorship. Traditionally, the notion of authorship and its accompanying privacy concerns applied only to human authors. However, in recent years, due to the explosive advancements in Neural Text Generation (NTG) techniques in NLP, capable of synthesizing human-quality open-ended texts (so-called neural texts), one now has to consider authorship by humans, machines, or their combination. Due to the implications and potential threats of neural texts when used maliciously, it has become critical to understand the limitations of traditional AA/AO solutions and develop novel AA/AO solutions for dealing with neural texts. In this survey, therefore, we make a comprehensive review of recent literature on the attribution and obfuscation of neural text authorship from a Data Mining perspective, and share our view on their limitations and promising research directions.
Full Text Available
Information Operations in Turkey: Manufacturing Resilience with Free Twitter Accounts

https://doi.org/10.1609/icwsm.v17i1.22175

Merhi, Maya; Rajtmajer, Sarah; Lee, Dongwon (June 2023, Proceedings of the International AAAI Conference on Web and Social Media)

Following the 2016 US elections, Twitter launched its Information Operations (IO) hub, where it archives account activity connected to state-linked information operations. In June 2020, Twitter took down and released a set of accounts linked to Turkey's ruling political party (AKP). We investigate these accounts in the aftermath of the takedown to explore whether AKP-linked operations are ongoing and to understand the strategies they use to remain resilient to disruption. We collect live accounts that appear to be part of the same network, ~30% of which have been suspended by Twitter since our collection. We create a BERT-based classifier that shows similarity between these two networks, develop a taxonomy to categorize these accounts, find direct sequel accounts between the Turkish takedown and the live accounts, and find evidence that Turkish IO actors deliberately construct their network to withstand large-scale shutdown by utilizing explicit and implicit signals of coordination. We compare our findings from the Turkish operation to Russian and Chinese IO on Twitter and find that Turkey's IO utilizes a unique group structure to remain resilient. Our work highlights the fundamental imbalance between IO actors quickly and easily creating free accounts and the social media platforms spending significant resources on detection and removal, and contributes novel findings about Turkish IO on Twitter.
Full Text Available
Associative Inference Can Increase People's Susceptibility to Misinformation

https://doi.org/10.1609/icwsm.v17i1.22166

Lee, Sian Lee; Seo, Haeseung; Lee, Dongwon (June 2023, 17th Int'l AAAI Conf. on Web and Social Media (ICWSM))

Full Text Available
Investigating the Impact of Skill-Related Videos on Online Learning

https://doi.org/10.1145/3573051.3593376

Prihar, Ethan; Haim, Aaron; Shen, Tracy; Sales, Adam; Lee, Dongwon; Wu, Xintao; Heffernan, Neil (July 2023, L@S '23: Proceedings of the Tenth ACM Conference on Learning @ Scale)

Many online learning platforms and MOOCs incorporate some amount of video-based content into their platform, but there are few randomized controlled experiments that evaluate the effectiveness of the different methods of video integration. Given the large amount of publicly available educational videos, an investigation into this content's impact on students could help lead to more effective and accessible video integration within learning platforms. In this work, a new feature was added into an existing online learning platform that allowed students to request skill-related videos while completing their online middle-school mathematics assignments. A total of 18,535 students participated in two large-scale randomized controlled experiments related to providing students with publicly available educational videos. The first experiment investigated the effect of providing students with the opportunity to request these videos, and the second experiment investigated the effect of using a multi-armed bandit algorithm to recommend relevant videos. Additionally, this work investigated which features of the videos were significantly predictive of students' performance and which features could be used to personalize students' learning. Ultimately, students were mostly uninterested in the skill-related videos, preferring instead to use the platform's existing problem-specific support, and there were no statistically significant findings in either experiment. Additionally, while no video features were significantly predictive of students' performance, two video features had significant qualitative interactions with students' prior knowledge, which showed that different content creators were more effective for different groups of students. These findings can be used to inform the design of future video-based features within online learning platforms and the creation of different educational videos specifically targeting higher or lower knowledge students.
The data and code used in this work can be found at https://osf.io/cxkzf/.
Full Text Available
Do Language Models Plagiarize?

https://doi.org/10.1145/3543507.3583199

Lee, Jooyoung; Le, Thai; Chen, Jinghui; Lee, Dongwon (April 2023, The ACM Web Conference (WWW))

Full Text Available
UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning

https://doi.org/10.18653/v1/2023.findings-emnlp.800

Wang, Ziyao; Le, Thai; Lee, Dongwon (January 2023, Association for Computational Linguistics)

Full Text Available
