skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Need for Tweet: How Open Source Developers Talk About Their GitHub Work on Twitter.
Social media, especially Twitter, has always been a part of the professional lives of software developers, with prior work reporting on a diversity of usage scenarios, including sharing information, staying current, and promoting one’s work. However, previous studies of Twitter use by software developers typically lack information about activities of the study subjects (and their outcomes) on other platforms. To enable such future research, in this paper we propose a computational approach to cross-link users across Twitter and GitHub, revealing (at least) 70,427 users active on both. As a preliminary analysis of this dataset, we report on a case study of 786 tweets by open-source developers about GitHub work, combining automatic characterization of tweet authors in terms of their relationship to the GitHub items linked in their tweets with qualitative analysis of the tweet contents. We find that different developer roles tend to have different tweeting behaviors, with repository owners being perhaps the most distinctive group compared to other project contributors and followers. We also note a sizeable group of people who follow others on GitHub and tweet about these people’s work, but do not otherwise contribute to those open-source projects. Our results and public dataset open up multiple future research directions.  more » « less
Award ID(s):
1901311
PAR ID:
10190353
Author(s) / Creator(s):
Date Published:
Journal Name:
Mining Software Repositories
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Twitter is widely used by software developers. But how effective are tweets at promoting open source projects? How could one use Twitter to increase a project’s popularity or attract new contributors? In this paper we report on a mixed-methods empirical study of 44,544 tweets containing links to 2,370 open-source GitHub repositories, looking for evidence of causal effects of these tweets on the projects attracting new GitHub stars and contributors, as well as characterizing the high-impact tweets, the people likely being attracted by them, and how they differ from contributors attracted otherwise. Among others, we find that tweets have a statistically significant and practically sizable effect on obtaining new stars and a small average effect on attracting new contributors. The popularity, content of the tweet, as well as the identity of tweet authors all affect the scale of the attraction effect. In addition, our qualitative analysis suggests that forming an active Twitter community for an open source project plays an important role in attracting new committers via tweets. We also report that developers who are new to GitHub or have a long history of Twitter usage but few tweets posted are most likely to be attracted as contributors to the repositories mentioned by tweets. Our work contributes to the literature on open source sustainability. 
    more » « less
  2. null (Ed.)
    Tweet hashtags have the potential to improve the search for information during disaster events. However, there is a large number of disaster-related tweets that do not have any user-provided hashtags. Moreover, only a small number of tweets that contain actionable hashtags are useful for disaster response. To facilitate progress on automatic identification (or extraction) of disaster hashtags for Twitter data, we construct a unique dataset of disaster-related tweets annotated with hashtags useful for filtering actionable information. Using this dataset, we further investigate Long Short-Term Memory-based models within a Multi-Task Learning framework. The best performing model achieves an F1-score as high as $92.22%$. The dataset, code, and other resources are available on Github.1 
    more » « less
  3. The purpose of the Twitter Disaster Behavior project is to identify patterns in online behavior during natural disasters by analyzing Twitter data. The main goal is to better understand the needs of a community during and after a disaster, to aid in recovery. The datasets analyzed were collections of tweets about Hurricane Maria, and recent earthquake events, in Puerto Rico. All tweets pertaining to Hurricane Maria are from the timeframe of September 15 through October 14, 2017. Similarly, tweets pertaining to the Puerto Rico earthquake from January 7 through February 6, 2020 were collected. These tweets were then analyzed for their content, number of retweets, and the geotag associated with the author of the tweet. We counted the occurrence of key words in topics relating to preparation, response, impact, and recovery. This data was then graphed using Python and Matplotlib. Additionally, using a Twitter crawler, we extracted a large dataset of tweets by users that used geotags. These geotags are used to examine location changes among the users before, during, and after each natural disaster. Finally, after performing these analyses, we developed easy to understand visuals and compiled these figures into a poster. Using these figures and graphs, we compared the two datasets in order to identify any significant differences in behavior and response. The main differences we noticed stemmed from two key reasons: hurricanes can be predicted whereas earthquakes cannot, and hurricanes are usually an isolated event whereas earthquakes are followed by aftershocks. Thus, the Hurricane Maria dataset experienced the highest amount of tweet activity at the beginning of the event and the Puerto Rico earthquake dataset experienced peaks in tweet activity throughout the entire period, usually corresponding to aftershock occurrences. We studied these differences, as well as other important trends we identified. 
    more » « less
  4. Open source software is commonly portrayed as a meritocracy, where decisions are based solely on their technical merit. However, literature on open source suggests a complex social structure underlying the meritocracy. Social work environments such as GitHub make the relationships between users and between users and work artifacts transparent. This transparency enables developers to better use information such as technical value and social connections when making work decisions. We present a study on open source software contribution in GitHub that focuses on the task of evaluating pull requests, which are one of the primary methods for contributing code in GitHub. We analyzed the association of various technical and social measures with the likelihood of contribution acceptance. We found that project managers made use of information signaling both good technical contribution practices for a pull request and the strength of the social connection between the submitter and project manager when evaluating pull requests. Pull requests with many comments were much less likely to be accepted, moderated by the submitter's prior interaction in the project. Well-established projects were more conservative in accepting pull requests. These findings provide evidence that developers use both technical and social information when evaluating potential contributions to open source software projects 
    more » « less
  5. The diffusion of information about open-source projects is a key factor influencing the adoption of projects and the allocation of developer efforts. Developers learn about new projects, and evaluate their quality and importance by accessing the related information. Social media is an important channel for information diffusion about open-source projects, with previous research suggesting the existence of a social media ecosystem that consists of multiple platforms and collectively supports information diffusion in open source. With different features supporting information diffusion, the same piece of information likely reaches different developer communities on different platforms, which attracts the attention and contribution of different developers and thus influences the success of open-source projects. Despite its importance, few works looked at the identity of the developer community that projectrelated information reaches on social media platforms and its associated impact on the discussed project. In this work, we track social media discussions on open-source projects on three different platforms: Twitter, HackerNews, and Reddit. We first describe the dynamics of project-related information diffusion across platforms, and we analyze the association between the number of posts on each platform, and the number of developers attracted to the discussed project from different communities. We find that posts about open-source projects first appear on Twitter and HackerNews, then move more towards Reddit. The number of project-related posts on Twitter mostly associate with the attracted developers from communities that are close to the project’s main contributor, while posts on other platforms associate more with the attention from remote communities. 
    more » « less