The diffusion of information about open-source projects is a key factor influencing the adoption of projects and the allocation of developer efforts. Developers learn about new projects, and evaluate their quality and importance by accessing the related information. Social media is an important channel for information diffusion about open-source projects, with previous research suggesting the existence of a social media ecosystem that consists of multiple platforms and collectively supports information diffusion in open source. With different features supporting information diffusion, the same piece of information likely reaches different developer communities on different platforms, which attracts the attention and contribution of different developers and thus influences the success of open-source projects. Despite its importance, few works looked at the identity of the developer community that projectrelated information reaches on social media platforms and its associated impact on the discussed project. In this work, we track social media discussions on open-source projects on three different platforms: Twitter, HackerNews, and Reddit. We first describe the dynamics of project-related information diffusion across platforms, and we analyze the association between the number of posts on each platform, and the number of developers attracted to the discussed project from different communities. We find that posts about open-source projects first appear on Twitter and HackerNews, then move more towards Reddit. The number of project-related posts on Twitter mostly associate with the attracted developers from communities that are close to the project’s main contributor, while posts on other platforms associate more with the attention from remote communities.
more »
« less
iDRAMA-Scored-2024: A Dataset of the Scored Social Media Platform from 2020 to 2023
Online web communities often face bans for violating platform policies, encouraging their migration to alternative platforms. This migration, however, can result in increased toxicity and unforeseen consequences on the new platform. In recent years, researchers have collected data from many alternative platforms, indicating coordinated efforts leading to offline events, conspiracy movements, hate speech propagation, and harassment. Thus, it becomes crucial to characterize and understand these alternative platforms. To advance research in this direction, we collect and release a large-scale dataset from Scored -- an alternative Reddit platform that sheltered banned fringe communities, for example, c/TheDonald (a prominent right-wing community) and c/GreatAwakening (a conspiratorial community). Over four years, we collected approximately 57M posts from Scored, with at least 58 communities identified as migrating from Reddit and over 950 communities created since the platform's inception. Furthermore, we provide sentence embeddings of all posts in our dataset, generated through a state-of-the-art model, to further advance the field in characterizing the discussions within these communities. We aim to provide these resources to facilitate their investigations without the need for extensive data collection and processing efforts.
more »
« less
- Award ID(s):
- 2247867
- PAR ID:
- 10545199
- Publisher / Repository:
- AAAI
- Date Published:
- Journal Name:
- Proceedings of the International AAAI Conference on Web and Social Media
- Volume:
- 18
- ISSN:
- 2162-3449
- Page Range / eLocation ID:
- 2014 to 2024
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Veterans are a unique marginalized group facing multiple vulnerabilities. Current assessments of veteran needs and support largely come from first-person accounts guided by researchers' prompts. Social media platforms not only enable veterans to connect with each other, but also to self-disclose experiences and seek support. This paper addresses the gap in our understanding of veteran needs and their own support dynamics by examining self-initiated and ecologically-valid self-expressions. In particular, we adopt the Veteran Critical Theory (VCT) to conduct a computational study on the Reddit community of veterans. Using topic modeling, we find veteran-friendly gestures with good intentions might not be appreciated in the subreddit. By employing transfer learning methodologies, we find this community has more informational and emotional support behaviors than general online communities and a higher prevalence of informational support than emotional support. Lastly, an examination of support dynamics reveals some contrasts to previous scholarship in military culture and social media. We discover that positive language and author platform tenure have negative relations with posts receiving replies and replies getting votes, and that replies reflecting personal disclosures tend to get more votes. Through the lens of VCT, we discuss how online communities can help uncover veterans' needs and provide more effective social support.more » « less
-
All That’s Happening behind the Scenes: Putting the Spotlight on Volunteer Moderator Labor in RedditOnline volunteers are an uncompensated yet valuable labor force for many social platforms. For example, volunteer content moderators perform a vast amount of labor to maintain online communities. However, as social platforms like Reddit favor revenue generation and user engagement, moderators are under-supported to manage the expansion of online communities. To preserve these online communities, developers and researchers of social platforms must account for and support as much of this labor as possible. In this paper, we quantitatively characterize the publicly visible and invisible actions taken by moderators on Reddit, using a unique dataset of private moderator logs for 126 subreddits and over 900 moderators. Our analysis of this dataset reveals the heterogeneity of moderation work across both communities and moderators. Moreover, we find that analyzing only visible work – the dominant way that moderation work has been studied thus far – drastically underestimates the amount of human moderation labor on a subreddit. We discuss the implications of our results on content moderation research and social platforms.more » « less
-
Rules are a critical component of the functioning of nearly every online community, yet it is challenging for community moderators to make data-driven decisions about what rules to set for their communities. The connection between a community's rules and how its membership feels about its governance is not well understood. In this work, we conduct the largest-to-date analysis of rules on Reddit, collecting a set of 67,545 unique rules across 5,225 communities which collectively account for more than 67% of all content on Reddit. More than just a point-in-time study, our work measures how communities change their rules over a 5+ year period. We develop a method to classify these rules using a taxonomy of 17 key attributes extended from previous work. We assess what types of rules are most prevalent, how rules are phrased, and how they vary across communities of different types. Using a dataset of communities' discussions about their governance, we are the first to identify the rules most strongly associated with positive community perceptions of governance: rules addressing who participates, how content is formatted and tagged, and rules about commercial activities. We conduct a longitudinal study to quantify the impact of adding new rules to communities, finding that after a rule is added, community perceptions of governance immediately improve, yet this effect diminishes after six months. Our results have important implications for platforms, moderators, and researchers. We make our classification model and rules datasets public to support future research on this topic.more » « less
-
Voluntary sharing of personal information is at the heart of user engagement on social media and central to platforms' business models. From the users' perspective, so-called self-disclosure is closely connected with both privacy risks and social rewards. Prior work has studied contextual influences on self-disclosure, from platform affordances and interface design to user demographics and perceived social capital. Our work takes a mixed-methods approach to understand the contextual information which might be integrated in the development of privacy-enhancing technologies. Through observational study of several Reddit communities, we explore the ways in which topic of discussion, group norms, peer effects, and audience size are correlated with personal information sharing. We then build and test a prototype privacy-enhancing tool that exposes these contextual factors. Our work culminates in a browser extension that automatically detects instances of self-disclosure in Reddit posts at the time of posting and provides additional context to users before they post to support enhanced privacy decision-making. We share this prototype with social media users, solicit their feedback, and outline a path forward for privacy-enhancing technologies in this space.more » « less
An official website of the United States government

