This content will become publicly available on May 14, 2024

Title: Understanding information diffusion about open-source projects on Twitter, HackerNews, and Reddit
The diffusion of information about open-source projects is a key factor influencing the adoption of projects and the allocation of developer effort. Developers learn about new projects and evaluate their quality and importance by accessing the related information. Social media is an important channel for information diffusion about open-source projects, with previous research suggesting the existence of a social media ecosystem that consists of multiple platforms and collectively supports information diffusion in open source. Because platforms offer different features supporting information diffusion, the same piece of information likely reaches different developer communities on different platforms, which attracts the attention and contribution of different developers and thus influences the success of open-source projects. Despite its importance, few works have looked at the identity of the developer community that project-related information reaches on social media platforms and its associated impact on the discussed project. In this work, we track social media discussions on open-source projects on three different platforms: Twitter, HackerNews, and Reddit. We first describe the dynamics of project-related information diffusion across platforms, and we analyze the association between the number of posts on each platform and the number of developers attracted to the discussed project from different communities. We find that posts about open-source projects first appear on Twitter and HackerNews, then move more towards Reddit. The number of project-related posts on Twitter is mostly associated with the developers attracted from communities that are close to the project’s main contributor, while posts on other platforms are associated more with attention from remote communities.
Award ID(s):
1901311
NSF-PAR ID:
10439400
Journal Name:
International Conference on Cooperative and Human Aspects of Software Engineering (CHASE)
Page Range / eLocation ID:
56 to 67
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Social media, especially Twitter, has always been a part of the professional lives of software developers, with prior work reporting on a diversity of usage scenarios, including sharing information, staying current, and promoting one’s work. However, previous studies of Twitter use by software developers typically lack information about activities of the study subjects (and their outcomes) on other platforms. To enable such future research, in this paper we propose a computational approach to cross-link users across Twitter and GitHub, revealing (at least) 70,427 users active on both. As a preliminary analysis of this dataset, we report on a case study of 786 tweets by open-source developers about GitHub work, combining automatic characterization of tweet authors in terms of their relationship to the GitHub items linked in their tweets with qualitative analysis of the tweet contents. We find that different developer roles tend to have different tweeting behaviors, with repository owners being perhaps the most distinctive group compared to other project contributors and followers. We also note a sizeable group of people who follow others on GitHub and tweet about these people’s work, but do not otherwise contribute to those open-source projects. Our results and public dataset open up multiple future research directions. 
  2. Wren, Jonathan (Ed.)
    Motivation: Substance abuse constitutes one of the major contemporary health epidemics. Recently, the use of social media platforms has garnered interest as a novel source of data for drug addiction epidemiology. Often, however, the language used in such forums comprises slang and jargon. Currently, there are no publicly available resources to automatically analyse the esoteric language use in the social media drug-use sub-culture. This lacuna introduces critical challenges for interpreting, sensemaking and modeling of addiction epidemiology using social media. Results: Drug-Use Insights (DUI) is a public and open-source web application to address the aforementioned deficiency. DUI is underpinned by a hierarchical taxonomy encompassing 108 different addiction-related categories consisting of over 9,000 terms, where each category encompasses a set of semantically related terms. These categories and terms were established by utilizing thematic analysis in conjunction with term embeddings generated from 7,472,545 Reddit posts made by 1,402,017 redditors. Given post(s) from social media forums such as Reddit and Twitter, DUI can be used foremost to identify constituent terms related to drug use. Furthermore, the DUI categories and integrated visualization tools can be leveraged for semantic and exploratory analysis. To the best of our knowledge, DUI utilizes the largest number of substance use and recovery social media posts used in a study and represents the first significant online taxonomy of drug abuse terminology. Availability: The DUI web server and source code are available at: http://haddock9.sfsu.edu/insight/ Supplementary information: Supplementary data are available at Bioinformatics online.
  3. Introduction: Social media has created opportunities for children to gather social support online (Blackwell et al., 2016; Gonzales, 2017; Jackson, Bailey, & Foucault Welles, 2018; Khasawneh, Rogers, Bertrand, Madathil, & Gramopadhye, 2019; Ponathil, Agnisarman, Khasawneh, Narasimha, & Madathil, 2017). However, social media also has the potential to expose children and adolescents to undesirable behaviors. Research has shown that social media can be used to harass, discriminate (Fritz & Gonzales, 2018), dox (Wood, Rose, & Thompson, 2018), and socially disenfranchise children (Page, Wisniewski, Knijnenburg, & Namara, 2018). Other research proposes that social media use might be correlated with the significant increase in suicide rates and depressive symptoms among children and adolescents in the past ten years (Mitchell, Wells, Priebe, & Ybarra, 2014). Evidence-based research suggests that suicidal and unwanted behaviors can be promulgated through social contagion effects, which model, normalize, and reinforce self-harming behavior (Hilton, 2017). These harmful behaviors and social contagion effects may occur more frequently through repetitive exposure and modelling via social media, especially when such content goes “viral” (Hilton, 2017). One example of viral self-harming behavior that has generated significant media attention is the Blue Whale Challenge (BWC). The hearsay about this challenge is that individuals of all ages are persuaded to participate in self-harm and eventually kill themselves (Mukhra, Baryah, Krishan, & Kanchan, 2017). Research is needed specifically concerning BWC ethical concerns, the effects the game may have on teenagers, and potential governmental interventions. To address this gap in the literature, the current study uses qualitative and content analysis research techniques to illustrate the risk of self-harm and suicide contagion through the portrayal of BWC in YouTube and Twitter posts.
The purpose of this study is to analyze the portrayal of BWC on YouTube and Twitter in order to identify the themes presented in YouTube and Twitter posts that share and discuss BWC. In addition, we want to explore to what extent YouTube videos are compliant with the safe and effective suicide messaging guidelines proposed by the Suicide Prevention Resource Center (SPRC). Method: Two social media websites were used to gather the data: 60 videos and 1,112 comments from YouTube and 150 posts from Twitter. The common themes of the YouTube videos, comments on those videos, and the Twitter posts were identified using grounded, thematic content analysis on the collected data (Padgett, 2001). Three codebooks were built, one for each type of data. The data for each site were analyzed, and the common themes were identified. A deductive coding analysis was conducted on the YouTube videos based on the nine SPRC safe and effective messaging guidelines (Suicide Prevention Resource Center, 2006). The analysis explored the number of videos that violated these guidelines and which guidelines were violated the most. The inter-rater reliabilities between the coders ranged from 0.61 to 0.81 based on Cohen’s kappa. Then the coders conducted consensus coding. Results & Findings: Three common themes were identified among all the posts in the three social media platforms included in this study. The first theme included posts where social media users were trying to raise awareness and warn parents about this dangerous phenomenon in order to reduce the risk of any potential participation in BWC. This was the most common theme in the videos and posts. Additionally, the posts claimed that there are more than 100 people who have played BWC worldwide and provided detailed descriptions of what each individual did while playing the game. These videos also described the tasks and different names of the game.
Only a few videos provided recommendations to teenagers who might be playing or thinking of playing the game, and fewer videos mentioned that the provided statistics were not confirmed by reliable sources. The second theme included posts by people who either criticized the teenagers who participated in BWC or made fun of them for a couple of reasons: they agreed with the purpose of BWC of “cleaning the society of people with mental issues,” or they misunderstood why teenagers participate in these kinds of challenges, such as thinking they mainly participate due to peer pressure or to “show off”. The last theme we identified was that most of these users tend to speak in detail about someone who already participated in BWC. These videos and posts provided information about the participants’ demographics and interviews with their parents or acquaintances, who also provided more details about the participant’s personal life. The evaluation of the videos based on the SPRC safe messaging guidelines showed that 37% of the YouTube videos met fewer than 3 of the 9 safe messaging guidelines. Around 50% of them met only 4 to 6 of the guidelines, while the remaining 13% met 7 or more of the guidelines. Discussion: This study is the first to systematically investigate the quality, portrayal, and reach of BWC on social media. Based on our findings from the emerging themes and the evaluation of the SPRC safe messaging guidelines, we suggest that these videos could contribute to the spread of these deadly challenges (or suicide in general, since the game might be a hoax) instead of raising awareness. Our suggestion parallels similar studies conducted on the portrayal of suicide in traditional media (Fekete & Macsai, 1990; Fekete & Schmidtke, 1995). Most posts on social media romanticized people who have died by following this challenge, and younger vulnerable teens may see the victims as role models, leading them to end their lives in the same way (Fekete & Schmidtke, 1995).
The videos presented statistics about the number of suicides believed to be related to this challenge in a way that made suicide seem common (Cialdini, 2003). In addition, the videos presented extensive personal information about the people who have died by suicide while playing the BWC. These videos also provided detailed descriptions of the final task, including pictures of self-harm, material that may encourage vulnerable teens to consider ending their lives and provide them with methods on how to do so (Fekete & Macsai, 1990). On the other hand, these videos failed to emphasize prevention by highlighting effective treatments for mental health problems, and they failed to encourage teenagers with mental health problems to seek help or to provide information on where to find it. YouTube and Twitter are capable of influencing a large number of teenagers (Khasawneh, Ponathil, Firat Ozkan, & Chalil Madathil, 2018; Pater & Mynatt, 2017). We suggest that it is urgent to monitor social media posts related to BWC and similar self-harm challenges (e.g., the Momo Challenge). Additionally, the SPRC should properly educate social media users, particularly those with more influence (e.g., celebrities), on elements that boost negative contagion effects. While the veracity of these challenges is doubted by some, posting about the challenges in an unsafe manner can contribute to contagion regardless of the challenges’ true nature.
  4. Since the start of the coronavirus disease 2019 (COVID-19) pandemic, social media platforms have been filled with discussions about the global health crisis. Meanwhile, the World Health Organization (WHO) has highlighted the importance of seeking credible sources of information on social media regarding COVID-19. In this study, we conducted an in-depth analysis of Twitter posts about COVID-19 during the early days of the COVID-19 pandemic to identify influential sources of COVID-19 information and understand the characteristics of these sources. We identified influential accounts based on an information diffusion network representing the interactions of Twitter users who discussed COVID-19 in the United States over a 24-h period. The network analysis revealed 11 influential accounts that we categorized as: 1) political authorities (elected government officials), 2) news organizations, and 3) personal accounts. Our findings showed that while verified accounts with a large following tended to be the most influential users, smaller personal accounts also emerged as influencers. Our analysis revealed that other users often interacted with influential accounts in response to news about COVID-19 cases, and that strongly contested political arguments received the most interactions overall. These findings suggest that political polarization was a major factor in COVID-19 information diffusion. We discuss the implications of political polarization on social media for COVID-19 communication.
  5. Retracted papers often circulate widely on social media, digital news, and other websites before their official retraction. The spread of potentially inaccurate or misleading results from retracted papers can harm the scientific community and the public. Here, we quantify the amount and type of attention 3,851 retracted papers received over time on different online platforms. Comparing them with a set of nonretracted control papers from the same journals with similar publication year, number of coauthors, and author impact, we show that retracted papers receive more attention after publication not only on social media but also on heavily curated platforms, such as news outlets and knowledge repositories, amplifying the negative impact on the public. At the same time, we find that posts on Twitter tend to express more criticism about retracted than about control papers, suggesting that criticism-expressing tweets could contain factual information about problematic papers. Most importantly, around the time they are retracted, papers generate discussions that are primarily about the retraction incident rather than about research findings, showing that by this point, papers have exhausted attention to their results and highlighting the limited effect of retractions. Our findings reveal the extent to which retracted papers are discussed on different online platforms and identify, at scale, audience criticism toward them. In this context, we show that retraction is not an effective tool to reduce online attention to problematic papers.