GitHub OSS Governance File Dataset
Open-source Software (OSS) has become a valuable resource in both industry and academia over the last few decades. Despite the innovative structures they develop to support the projects, OSS projects and their communities have complex needs and face risks such as getting abandoned. To manage the internal social dynamics and community evolution, OSS developer communities have started relying on written governance documents that assign roles and responsibilities to different community actors. To facilitate the study of the impact and effectiveness of formal governance documents on OSS projects and communities, we present a longitudinal dataset of 710 GitHub-hosted OSS projects with GOVERNANCE.MD governance files. This dataset includes all commits made to the repository, all issues and comments created on GitHub, and all revisions made to the governance file. We hope its availability will foster more research interest in studying how OSS communities govern their projects and the impact of governance files on communities.
- Award ID(s): 2217653
- PAR ID: 10426293
- Date Published:
- Journal Name: IEEE International Working Conference on Mining Software Repositories
- ISSN: 2160-1852
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The involvement of companies and public institutions in open-source software (OSS) has become widespread. While studies have explored the business models of for-profit organizations and their impact on software quality, little is known about their influence on OSS communities, especially in terms of diversity and inclusion. This knowledge gap is significant, considering that many organizations have the resources to enhance diversity and inclusion internally, but whether these efforts extend to OSS remains uncertain. To address this gap, we conducted interviews with maintainers of community-owned and organization-owned OSS projects, revealing tensions between organizations and their projects and identifying the impact of internal policies on OSS communities. Our findings reveal that, on the one hand, organization-owned projects often restrict external contributions due to stringent operating procedures and segmented communication, leading to limited external engagement. On the other hand, these organizations positively influence diversity and inclusion, notably in the representation and roles of women and the implementation of mentorship programs.
- Several Open-Source Software (OSS) projects depend on the continuity of their development communities to remain sustainable. Understanding how developers become inactive or why they take breaks can help communities prevent abandonment and incentivize developers to come back. In this paper, we propose a novel method to identify developers’ inactive periods by analyzing the individual rhythm of contributions to the projects. Using this method, we quantitatively analyze the inactivity of core developers in 18 OSS organizations hosted on GitHub. We also survey core developers to receive their feedback about the identified breaks and transitions. Our results show that our method was effective for identifying developers’ breaks. About 94% of the surveyed core developers agreed with our state model of inactivity; 71% and 79% of them acknowledged their breaks and state transition, respectively. We also show that all core developers take breaks (at least once) and about half of them (~45%) have completely disengaged from a project for at least one year. We also analyzed the probability of transitions to/from inactivity and found that developers who pause their activity have a ~35 to ~55% chance to return to an active state; yet, if the break lasts for a year or longer, then the probability of resuming activities drops to ~21–26%, with a ~54% chance of complete disengagement. These results may support the creation of policies and mechanisms to make OSS community managers aware of breaks and potential project abandonment.
- Corporate involvement in open source software (OSS) communities has increased substantially in recent years. Often this takes the form of company employees devoting their time to contribute code to the efforts of projects in these communities. Ideology has traditionally served to motivate, coordinate, and guide volunteer contributions to OSS communities. As employees represent an increasing proportion of the participants in OSS communities, the role of OSS ideology in guiding their commitment and code contributions is unknown. In this research, we argue that OSS ideology misfit has important implications for companies and the OSS communities to which their employees contribute, since their engagement in such communities is not necessarily voluntary. We conceptualize two different types of misfit: OSS ideology under-fit, whereby an employee embraces an OSS ideology more than their coworkers or OSS community do, and OSS ideology over-fit, whereby an employee perceives that their coworkers or OSS community embrace the OSS ideology more strongly than the employee does. To develop a set of hypotheses about the implications of these two types of misfit for employee commitment to the company and commitment to the OSS community, we draw on self-determination theory. We test the hypotheses in a field study of 186 employees who participate in an OSS community. We find that OSS ideology under-fit impacts the company and the community in the same way: it decreases employee commitment to the company and commitment to the OSS community. In contrast, we find that OSS ideology over-fit increases commitment to the company but decreases commitment to the OSS community. Finally, we find that employees’ commitment to their company reinforces the impact of their commitment to the OSS community in driving ongoing code contributions. This provides a holistic view of OSS ideology and its impacts among an increasingly pervasive yet understudied type of participant in OSS research. It provides insights for companies that are considering assigning their employees to work in OSS communities as well as for OSS communities that are partnering with these companies.
- Open Source Software (OSS) projects, especially successful and sustainable ones, form much of the fabric of our digital society. But many OSS projects do not become sustainable, resulting in abandonment and even risks for the world's digital infrastructure. Prior work has looked at the reasons for this mainly from two very different perspectives. In software engineering, the focus has been on understanding success and sustainability from the socio-technical perspective: the OSS programmers' day-to-day activities and the artifacts they create. In institutional analysis, on the other hand, emphasis has been on institutional designs (e.g., policies, rules, and norms) that structure project governance. Even though each is necessary for a comprehensive understanding of OSS projects, the connection and interaction between the two approaches have been barely explored. In this paper, we make the first effort toward understanding OSS project sustainability using a dual-view analysis, by combining institutional analysis with socio-technical systems analysis. In particular, we (i) use linguistic approaches to extract institutional rules and norms from OSS contributors' communications to represent the evolution of their governance systems, and (ii) construct socio-technical networks based on longitudinal collaboration records to represent each project's organizational structure. We combined the two methods and applied them to a dataset of developer digital traces from 253 nascent OSS projects within the Apache Software Foundation (ASF) incubator. We find that the socio-technical and institutional features relate to each other, and provide complementary views into the progress of the ASF's OSS projects. Refining these combined analyses can help provide a more precise understanding of the synchronization between the evolution of institutional governance and organizational structure.