

Search for: All records

Award ID contains: 2150217



  1. Remote password guessing attacks remain one of the largest sources of account compromise. Understanding and characterizing attacker strategies is critical to improving security, but doing so has been challenging because of the sensitivity of login services and the lack of ground truth labels for benign and malicious login requests. We perform an in-depth measurement study of guessing attacks targeting two large universities. Using a rich dataset of more than 34 million login requests to the two universities, as well as thousands of compromise reports, we developed a new analysis pipeline that identified 29 attack clusters, many of which involved compromises not previously known to security engineers. Our analysis provides the richest investigation to date of password guessing attacks as seen from login services. We believe our tooling will be useful in future efforts to develop real-time detection of attack campaigns, and our characterization of attack campaigns can help guide mitigation design more broadly.
    Free, publicly-accessible full text available October 26, 2024
  2. Social media users have long been aware of opaque content moderation systems and how they shape platform environments. On TikTok, creators increasingly use algospeak to circumvent unjust content restriction: they change or invent words to prevent TikTok’s content moderation algorithm from banning their videos (e.g., “le$bean” for “lesbian”). We interviewed 19 TikTok creators about their motivations for using algospeak and their practices of doing so, in relation to their experience with TikTok’s content moderation. Participants largely anticipated how TikTok’s algorithm would read their videos, and used algospeak to evade unjustified content moderation while ensuring that target audiences could still find their videos. We identify non-contextuality, randomness, inaccuracy, and bias against marginalized communities as major issues regarding freedom of expression, equality of subjects, and support for communities of interest. Drawing on creators’ use of algospeak, we argue for the need to improve contextually informed content moderation to valorize marginalized and tabooed audiovisual content on social media.

     
    Free, publicly-accessible full text available July 1, 2024
  3. Algospeak refers to social media users intentionally altering or substituting words when creating or sharing online content, for example, using ‘le$bean’ for ‘lesbian’. This study discusses the characteristics of algospeak as a computer-mediated language phenomenon on TikTok with regard to users’ algorithmic literacy and their awareness of how the platform’s algorithms work. We then present results from an interview study with TikTok creators on their motivations for using algospeak. Our results indicate that algospeak is used to oppose TikTok’s algorithmic moderation system in order to prevent unjust content violations and shadowbanning when posting about benign yet seemingly unwanted subjects on TikTok. We find that although algospeak helps to prevent these consequences, it often impedes the creation of quality content. We provide an adapted definition of algospeak and new insights into user-platform interactions in the context of algorithmic systems and algorithm awareness.
    Free, publicly-accessible full text available April 30, 2024
  4. The test suites of an Android app should take advantage of different types of tests, including end-to-end tests, which validate user flows, and unit tests, which provide focused executions for debugging. App developers have two main options when creating unit tests: create unit tests that run on a device (either physical or emulated) or create unit tests that run on a development machine’s Java Virtual Machine (JVM). Unit tests that run on a device are not really focused, as they use the full implementation of the Android framework. Moreover, they are fairly slow to execute, because they require the Android system as the runtime. Unit tests that run on the JVM, by contrast, are more focused and run more efficiently, but they require developers to suitably handle the coupling between the app under test and the Android framework. To help developers create focused unit tests that run on the JVM, we propose a novel technique called ARTISAN, based on the idea of test carving. The technique (i) traces the app execution during end-to-end testing on Android devices, (ii) identifies focal methods to test, (iii) carves the necessary preconditions for testing those methods, (iv) creates suitable test doubles for the Android framework, and (v) synthesizes executable unit tests that can run on the JVM. We evaluated ARTISAN using 152 end-to-end tests from five apps and observed that ARTISAN can generate unit tests that cover a significant portion of the code exercised by the end-to-end tests (i.e., 45% of the starting statement coverage on average) and does so in a few minutes.
  5. This study examines the content and layout of the proposed broadband consumer disclosure labels mandated by the U.S. Federal Communications Commission (FCC). Our large-scale user study identifies key consumer preferences and comprehension factors through a two-phase survey of 2,500 broadband internet consumers. Findings reveal strong support for broadband labels, but dissatisfaction with the FCC's proposed labels from 2016. Participants generally struggled to use the label for cost computations and plan comparisons. Technical terms confused participants, but providing participants with brief education made the terms usable. Participants desired additional information, including reliability, speed measures both for periods when performance is “normal” and for periods when performance is much worse than normal, quality-of-experience ratings, and detailed network management practices. This feedback informed our improved label designs, which outperformed the 2016 labels in comprehension and preference. Overall, consumers valued clear pricing and performance details, comprehensive information, and an easy-to-understand format for plan comparison. Requiring broadband service providers to deposit machine-readable plan information in a publicly accessible database would enable third parties to further customize how information is presented to meet these consumer needs. Our work additionally highlights the need for user studies of labels to ensure they meet consumer demands.
  6. In this paper, we describe the iterative evaluation and refinement of a consent flow for a chatbot being developed by a large U.S. health insurance company. This chatbot’s use of a cloud service provider triggers a requirement for users to agree to a HIPAA authorization. We highlight remote usability study and online survey findings indicating that simplifying the interface and language of the consent flow can improve the user experience and help users who read the content understand how their data may be used. However, we observe that most users in our studies, even those using our improved consent flows, missed important information in the authorization until we asked them to review it again. We also show that many people are overconfident about the privacy and security of healthcare data and that many people believe HIPAA protects their data in far more contexts than it actually does. Given that our redesigns following best practices did not produce many meaningful improvements in informed consent, we argue for the need for research on alternate approaches to health data disclosures, such as standardized disclosures; methods borrowed from clinical research contexts, such as multimedia formats, quizzes, and conversational approaches; and automated privacy assistants.
  7. Talks at practitioner-focused open-source software conferences are a valuable source of information for software engineering researchers. They provide a pulse of the community and are valuable source material for grey literature analysis. We curated a dataset of 24,669 talks from 87 open-source conferences between 2010 and 2021. We stored all relevant metadata from these conferences and provide scripts to collect the transcripts. We believe this data is useful for answering many kinds of questions, such as: What are the important/highly discussed topics within practitioner communities? How do practitioners interact? And how do they present themselves to the public? We demonstrate the usefulness of this data by reporting our findings from two small studies: a topic model analysis providing an overview of open-source community dynamics since 2011 and a qualitative analysis of a smaller community-oriented sample within our dataset to gain a better understanding of why contributors leave open-source software. 
  8. Open-source software has integrated itself into our daily lives, impacting 78% of US companies in 2015 [11]. Past studies of open-source community dynamics have identified motivations behind contributions [3, 14, 18, 19] and the significance of community engagement [8, 17], but many aspects are still not well understood. There is a direct correlation between the success of an open-source project and the social interactions within its community [7, 9, 17]. Most projects depend on a small group. A study by Avelino et al. [4] on the 133 most popular GitHub projects found that 86% will fail if one or two of their core contributors leave. To sustain open-source, we need to better understand how contributors interact, what information is shared, and what concerns practitioners have. We study common topics, how these have changed over time (2011–2021), and what social issues have appeared within open-source communities. Our research is guided by the following questions: (1) How is open-source changing/evolving? (2) What changes do practitioners believe are necessary for open-source to be sustainable?
  9. Contributors are vital to the sustainability of open source ecosystems, and disengagement threatens that sustainability. We seek to both strengthen and protect open source communities by creating a more robust way of defining and identifying contributor disengagement in these communities. To do this, we collected a large amount of grey literature relating to contributor disengagement and performed a qualitative analysis in order to better our understanding of why contributors disengage. 
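
The test-carving idea summarized in entry 4 (trace an end-to-end run, carve the observed framework interactions, replay them through a test double) can be illustrated with a minimal sketch. This is not ARTISAN's implementation, which targets Android apps and JVM unit tests; it is a language-neutral toy in Python. Every name here (`FakeFramework`, `RecordingProxy`, `ReplayDouble`, `format_distance`) is hypothetical and exists only for illustration.

```python
class FakeFramework:
    """Hypothetical stand-in for a heavyweight framework service."""

    def get_preference(self, key):
        return {"units": "metric"}[key]


class RecordingProxy:
    """Wraps the real dependency and logs every call made through it (tracing)."""

    def __init__(self, target, trace):
        self._target = target
        self._trace = trace

    def __getattr__(self, name):
        real = getattr(self._target, name)

        def wrapper(*args):
            result = real(*args)
            # Record method name, positional arguments, and the observed result.
            self._trace.append((name, args, result))
            return result

        return wrapper


class ReplayDouble:
    """Test double that answers from the recorded trace, not the framework."""

    def __init__(self, trace):
        self._answers = {(name, args): result for name, args, result in trace}

    def __getattr__(self, name):
        def stub(*args):
            # Raises KeyError for any call that was not observed in the trace.
            return self._answers[(name, args)]

        return stub


def format_distance(framework, meters):
    """Hypothetical focal method: its output depends on a framework lookup."""
    if framework.get_preference("units") == "metric":
        return f"{meters / 1000:.1f} km"
    return f"{meters / 1609.344:.1f} mi"


# "End-to-end" run: exercise the focal method against the real dependency
# while recording its interactions with the framework.
trace = []
observed = format_distance(RecordingProxy(FakeFramework(), trace), 2500)


def carved_unit_test():
    """Carved test: same input, framework replaced by the replay double."""
    assert format_distance(ReplayDouble(trace), 2500) == observed


carved_unit_test()
```

The carved test no longer needs the framework at all, which is the property that lets such tests run quickly in isolation. The cost is that the double only knows the calls seen during tracing, so any call not in the trace fails loudly.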