Title: Incentives and integration in scientific software production
Science policy makers are looking for approaches to increase the extent of collaboration in the production of scientific software, looking to open collaborations in open source software for inspiration. We examine the software ecosystem surrounding BLAST, a key bioinformatics tool, identifying outside improvements and interviewing their authors. We find that academic credit is a powerful motivator for the production and revealing of improvements. Yet surprisingly, we also find that improvements motivated by academic credit are less likely to be integrated than those with other motivations, including financial gain. We argue that this is because integration makes it harder to see who has contributed what and thereby undermines the ability of reputation to function as a reward for collaboration. We consider how open source avoids these issues and conclude with policy approaches to promoting wider collaboration by addressing incentives for integration.
Award ID(s):
1111750 1064209 0943168
PAR ID:
10038302
Journal Name:
Computer Supported Cooperative Work
Page Range / eLocation ID:
459-470
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Institutional arrangements that guide collective action between entities create benefits and burdens for collaborating entities and can encourage cooperation or create coordination dilemmas. There is an abundance of research in public policy, public administration, and nonprofit management on cross‐sector alliances, co‐production, and collaborative networks. We contribute to advancing this research by introducing a methodological approach that combines two text‐based methods: institutional network analysis and cost–benefit analysis. We utilize the Institutional Grammar to code policy documents that govern relationships between actors. The coded text is then used to identify Networks of Prescribed Interactions to analyze institutional relationships between policy actors. We then utilize the coded text in a cost–benefit analysis to assess benefit and burden distributive effects. This integrated methodological framework provides researchers with a tool to elucidate both the institutional patterns of interaction and distributive implications embedded in policy documents, revealing insights that single‐method approaches cannot capture. To demonstrate the utility of this integrated approach, we examine the policy design of two nonprofit open‐source software (OSS) incubation programs with contrasting characteristics: the Apache Software Foundation (ASF) and the Open Source Geospatial Foundation (OSGeo). 
We select these cases because: (1) they are co‐production alliances and have policy documents that articulate support for collective action; (2) their policy documents and group discussions are open access, creating an opportunity to advance text‐based policy analysis methods; and (3) they represent juxtaposed examples of high and low risk for collaboration settings, thereby providing two illustrative cases of the combined network and cost–benefit text‐based methodological approach. The network analysis finds that ASF policies, as a high‐risk setting, emphasize bonding structures, particularly higher reciprocity, which creates a context for cooperation. OSGeo, a low‐risk setting, has policies creating a context for bridging structures, evident in high brokerage efficiency, to facilitate coordination. The cost–benefit analysis finds that ASF policies balance the distribution of costs and benefits between ASF and projects, while in OSGeo, projects bear both costs and benefits. These findings demonstrate that the combination of network and cost–benefit analysis is an effective tool for utilizing text to compare policy designs. 
  2. With the emergence of social coding platforms, collaboration has become a key and dynamic aspect of the success of software projects. On such platforms, developers have to collaborate and deal with issues of collaboration in open-source software development. Although collaboration is challenging, collaborative development produces better software systems than any developer could produce alone. Several approaches have investigated collaboration challenges, for instance, by proposing or evaluating models and tools to support collaborative work. Despite the undeniable importance of the existing efforts in this direction, there are few works on collaboration from the perspectives of developers. In this work, we aim to investigate the perceptions of open-source software developers on collaboration, such as motivations, techniques, and tools to support global, productive, and collaborative development. Following an ad hoc literature review and an exploratory interview study with 12 open-source software developers from GitHub, our approach also relies on an extensive survey with 121 developers to confirm or refute the interview results. We found different collaborative contributions, such as managing change requests. We also observed that most collaborators prefer to collaborate with the core team instead of their peers, and that most collaboration happens in software development (60%) and maintenance (47%) tasks. Furthermore, despite personal preferences to work independently, developers still consider collaborating with others in specific task categories, for instance, software development. Finally, developers also expressed the importance of social coding platforms, such as GitHub, in supporting maintainers and contributors in making decisions and carrying out project tasks. Therefore, these findings may help project leaders optimize the collaborations among developers and reduce entry barriers. 
Moreover, these findings may support the project collaborators in understanding the collaboration process and engaging others in the project. 
  3. In this paper, we present a novel open-source electricity systems optimization tool, the Holistic Optimization Program for Electricity (HOPE), to assess emerging generation technology, inform policy design, and support planning. With a highly transparent, interpretable, and compact model design, HOPE easily allows user access and modification, serving its main goal of benefiting users beyond engineering communities and facilitating collaboration across the science-policy boundary. By activating different modes, the current version of HOPE (v1.0) offers flexibility in serving as either a Generation and Transmission Expansion Planning tool (GTEP) or a Production Cost Modelling tool (PCM). It includes modelling features such as long-term resource investments, short-term system operations, and a detailed representation of policies across various levels of regulated institutions. This paper outlines the building blocks of the model and its software structure. Case study results from using HOPE for the state of Maryland as well as the Pennsylvania-New Jersey-Maryland (PJM) footprint are also provided. 
  4. The introduction of machine learning (ML) components in software projects has created the need for software engineers to collaborate with data scientists and other specialists. While collaboration can always be challenging, ML introduces additional challenges with its exploratory model development process, additional skills and knowledge needed, difficulties testing ML systems, need for continuous evolution and monitoring, and non-traditional quality requirements such as fairness and explainability. Through interviews with 45 practitioners from 28 organizations, we identified key collaboration challenges that teams face when building and deploying ML systems into production. We report on common collaboration points in the development of production ML systems for requirements, data, and integration, as well as corresponding team patterns and challenges. We find that most of these challenges center around communication, documentation, engineering, and process, and collect recommendations to address these challenges. 
  5. Scientific progress relies crucially on software, yet in practice there are significant challenges to scientific software production and maintenance. We conducted a case study of a bioinformatics software library called Biopython to investigate the promise of Google Summer of Code (GSoC), a program that pays students to work on open-source projects for the summer, for addressing these challenges. We find three positive outcomes of GSoC in the Biopython community: the addition of new features to the Biopython codebase, training, and personal development. We also find, however, that mentors face several challenges related to GSoC project selection and ranking. We believe that because GSoC provides an occasion to extend the software with capabilities that can be used to produce new knowledge, and to train successive generations of potential contributors to the software, it can play a vital role in the sustainability of open-source scientific software. 