skip to main content


This content will become publicly available on April 30, 2024

Title: Code Recommendation for Open Source Software Developers
Open Source Software (OSS) is forming the spines of technology infrastructures, attracting millions of talents to contribute. Notably, it is challenging and critical to consider both the developers’ interests and the semantic features of the project code to recommend appropriate development tasks to OSS developers. In this paper, we formulate the novel problem of code recommendation, whose purpose is to predict the future contribution behaviors of developers given their interaction history, the semantic features of source code, and the hierarchical file structures of projects. We introduce CODER, a novel graph-based CODE Recommendation framework for open source software developers, which accounts for the complex interactions among multiple parties within the system. CODER jointly models microscopic user-code interactions and macroscopic user-project interactions via a heterogeneous graph and further bridges the two levels of information through aggregation on filestructure graphs that reflect the project hierarchy. Moreover, to overcome the lack of reliable benchmarks, we construct three largescale datasets to facilitate future research in this direction. Extensive experiments show that our CODER framework achieves superior performance under various experimental settings, including intraproject, cross-project, and cold-start recommendation.  more » « less
Award ID(s):
2211557 1937599
NSF-PAR ID:
10464425
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
WWW '23: Proceedings of the ACM Web Conference 2023
Page Range / eLocation ID:
1324 to 1333
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Software bots are used by Open Source Software (OSS) projects to streamline the code review process. Interfacing between developers and automated services, code review bots report continuous integration failures, code quality checks, and code coverage. However, the impact of such bots on maintenance tasks is still neglected. In this paper, we study how project maintainers experience code review bots. We surveyed 127 maintainers and asked about their expectations and perception of changes incurred by code review bots. Our findings reveal that the most frequent expectations include enhancing the feedback bots provide to developers, reducing the maintenance burden for developers, and enforcing code coverage. While maintainers report that bots satisfied their expectations, they also perceived unexpected effects, such as communication noise and newcomers' dropout. Based on these results, we provide a series of implications for bot developers, as well as insights for future research. 
    more » « less
  2. null (Ed.)
    Software developers who want to start contributing to an Open Source Software (OSS) project often struggle to find appropriate first tasks. The voluntary, self-organizing distribution of decentralized labor and the distinct nature of some OSS projects intensifies this challenge. Mentors, who work closely with newcomers, develop strategies to recommend tasks. However, to date neither the challenges mentors face in recommending tasks nor their strategies have been formally documented or studied. In this paper, we interviewed mentors of well-established OSS projects (n=10) and qualitatively analyzed their answers to identify both challenges and strategies related to recommending tasks for newcomers. Then, we employed a survey (n=30) to map the strategies to challenges and collect additional strategies. Our study identified 7 challenges and 13 strategies related to task recommendation. Strategies such as “tagging the issues based on difficulty,” “adding documentation,” “assigning a small task first and then challenge the newcomers with bigger tasks,” and “dividing tasks into smaller pieces” were frequently mentioned as ways to overcome multiple challenges. Our results provide insights for mentors about the strategies OSS communities can use to guide their mentors and for tool builders who design automated support for task assignment. 
    more » « less
  3. Open source software (OSS), a form of Digital or Knowledge Commons, underlies much of the technology that we use in our daily lives. The existence and continuation of OSS relies on the contribution of private resources – personal time, volunteer energy, and effort of numerous actors (e.g., software developers’ time as a common-pool resource) – to public goods, the benefits of which are enjoyed by everyone. Nonprofit organizations such as the Apache Software Foundation (ASF) attempt to aid this process by providing various collective services to OSS projects, acting as a second-order actor in the production of the public good. To this end, the ASF Incubator has created policies – essentially rules or norms – that serve to protect its interests and, as they say, increase the sustainability of the projects. Each policy requires investment by ASF (in terms of money or the use of volunteer time) or an incubating project (in terms of taking project personnel time), the benefits of which can accrue to either party. Such policies may impose additional costs on incubating projects, leading to a decreased production of the OSS public good. Using the ASF Incubator policy documents, we construct a dataset that records who – ASF or an incubating project – bears the cost and who enjoys the benefit of each policy and procedure. We can code most policy statements as costing one party and benefiting one party. The distribution of costs and benefits according to party indicates whether the second-order actor is contributing to an increase in the public good and if they are doing so sustainably. Through a two-way ANOVA, we characterize the impact of ASF policies on the production of public goods (OSS). Being a part of ASF imposes some costs on projects, but these costs may make projects more sustainable. Our analysis shows that the distribution of costs and benefits is fairly symmetric between the ASF and incubating projects. Thus, the configuration of policies or the “institutional design” of the ASF could aid in producing the OSS public good by providing services that projects require. 
    more » « less
  4. null (Ed.)
    Open Source Software (OSS) projects start with an initial vocabulary, often determined by the first generation of developers. This vocabulary, embedded in code identifier names and internal code comments, goes through multiple rounds of change, influenced by the interrelated patterns of human (e.g., developers joining and departing) and system (e.g., maintenance activities) interactions. Capturing the dynamics of this change is crucial for understanding and synthesizing code changes over time. However, existing code evolution analysis tools, available in modern version control systems such as GitHub and SourceForge, often overlook the linguistic aspects of code evolution. To bridge this gap, in this paper, we propose to study code evolution in OSS projects through the lens of developers' language, also known as code lexicon. Our analysis is conducted using 32 OSS projects sampled from a broad range of application domains. Our results show that different maintenance activities impact code lexicon differently. These insights lay out a preliminary foundation for modeling the linguistic history of OSS projects. In the long run, this foundation will be utilized to provide support for basic program comprehension tasks and help researchers gain new insights into the complex interplay between linguistic change and various system and human aspects of OSS development. 
    more » « less
  5. Women are underrepresented in Open Source Software (OSS) projects, as a result of which, not only do women lose career and skill development opportunities, but the projects themselves suffer from a lack of diversity of perspectives. Practitioners and researchers need to understand more about the phenomenon; however, studies about women in open source are spread across multiple fields, including information systems, software engineering, and social science. This paper systematically maps, aggregates, and synthesizes the state-of-the-art on women’s participation in OSS. It focuses on women contributors’ representation and demographics, how they contribute, their motivations and challenges, and strategies employed by communities to attract and retain women. We identified 51 articles (published between 2000 and 2021) that investigated women’s participation in OSS. We found evidence in these papers about who are the women who contribute, what motivates them to contribute, what types of contributions they make, challenges they face, and strategies proposed to support their participation. According to these studies, only about 5% of projects were reported to have women as core developers, and women authored less than 5% of pull-requests, but had similar or even higher rates of pull request acceptances than men. Women make both code and non-code contributions and their motivations to contribute include, learning new skills, altruism, reciprocity, and kinship. Challenges that women face in OSS are mainly social, including lack of peer parity and non-inclusive communication from a toxic culture. We found ten strategies reported in the literature, which we mapped to the reported challenges. Based on these results, we provide guidelines for future research and practice. 
    more » « less