Title: Collaborative Writing on GitHub: A Case Study of a Book Project
Social coding platforms such as GitHub are increasingly becoming a digital workspace for the production of non-software digital artifacts. Since GitHub offers unique features that are different from traditional ways of collaborative writing, it is interesting to investigate how GitHub features are used for writing. In this paper, we present the preliminary findings of a mixed-methods case study of collaboration practices in a GitHub book project. We found that the use of GitHub depended on task interdependence and audience participation. GitHub's direct push method was used to coordinate both loosely- and tightly-coupled work, with the latter requiring collaborators to follow socially-accepted conventions. The pull-based method was adopted once the project was released to the public. While face-to-face and online meetings were prominent in the early phases, GitHub's issues became instrumental for communication and project management in later phases. Our findings have implications for the design of collaborative writing tools.
Award ID(s):
1633437
PAR ID:
10106648
Journal Name:
Companion of the 2018 ACM Conference on Computer Supported Cooperative Work and Social Computing
Page Range / eLocation ID:
305 to 308
Sponsoring Org:
National Science Foundation
More Like this
  1. In this chapter I report on a design research project of a digital collaborative platform for an embedded problem-based mathematics curriculum—the Connected Mathematics Project (CMP). The goal of the project is to enhance the teaching and learning of mathematics that occurs in paper-and-pencil classrooms by leveraging the affordances of digital technologies in a digital classroom environment. In this chapter I share lessons learned for developing the digital collaborative platform for students and teachers, focusing on how the project team: (1) reimagined mathematics problems delivered in a digital collaborative platform, (2) supported a model of collaboration in the digital platform, and (3) provided students with just-in-time supports in the digital collaborative platform. To illustrate the lessons learned, I report on the iterative changes made to features of the digital collaborative platform based on analysis of project data and feedback from teachers and students.
  2. Theoretical and Empirical Modeling of Identity and Sentiments in Collaborative Groups (THEMIS.COG) was an interdisciplinary research collaboration of computer scientists and social scientists from the University of Waterloo (Canada), Potsdam University of Applied Sciences (Germany), and Dartmouth College (USA). This white paper summarizes the results of our research at the end of the grant term. Funded by the Trans-Atlantic Platform’s Digging Into Data initiative, the project aimed at theoretical and empirical modeling of identity and sentiments in collaborative groups. Understanding the social forces behind self-organized collaboration is important because technological and social innovations are increasingly generated through informal, distributed processes of collaboration, rather than in formal organizational hierarchies or through market forces. Our work used a data-driven approach to explore the social psychological mechanisms that motivate such collaborations and determine their success or failure. We focused on the example of GitHub, currently the world’s largest digital platform for open, collaborative software development. In contrast to most contemporary approaches that leverage computational techniques for social science, which are purely inductive, THEMIS.COG followed a deductive, theory-driven approach. We capitalized on affect control theory, a mathematically formalized theory of symbolic interaction originated by sociologist David R. Heise and further advanced in previous work by some of the THEMIS.COG collaborators, among others. Affect control theory states that people control their social behaviours by intuitively attempting to verify culturally shared feelings about identities, social roles, and behaviour settings. From this principle, implemented in computational simulation models, precise predictions about group dynamics can be derived. It was the goal of THEMIS.COG to adapt and apply this approach to study the GitHub collaboration ecosystem through a symbolic interactionist lens. The project contributed substantially to the novel endeavor of theory development in social science based on large amounts of naturally occurring digital data.
  3. Understanding “how to optimize the production of scientific knowledge” is paramount to those who support scientific research—funders as well as research institutions—to the communities served, and to researchers. Structured archives can help all involved to learn what decisions and processes help or hinder the production of new knowledge. Using artificial intelligence (AI) and large language models (LLMs), we recently created the first structured digital representation of the historic archives of the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health. This work yielded a digital knowledge base of entities, topics, and documents that can be used to probe the inner workings of the Human Genome Project, a massive international public-private effort to sequence the human genome, and several of its offshoots like The Cancer Genome Atlas (TCGA) and the Encyclopedia of DNA Elements (ENCODE). The resulting knowledge base will be instrumental in understanding not only how the Human Genome Project and genomics research developed collaboratively, but also how scientific goals come to be formulated and evolve. Given the diverse and rich data used in this project, we evaluated the ethical implications of employing AI and LLMs to process and analyze this valuable archive. As the first computational investigation of the internal archives of a massive collaborative project with multiple funders and institutions, this study will inform future efforts to conduct similar investigations while also considering and minimizing ethical challenges. Our methodology and risk-mitigating measures could also inform future initiatives in developing standards for project planning, policymaking, enhancing transparency, and ensuring ethical utilization of artificial intelligence technologies and large language models in archive exploration.
    Author Contributions: Mohammad Hosseini: Investigation; Project Administration; Writing – original draft; Writing – review & editing. Spencer Hong: Conceptualization, Data curation, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing. Thomas Stoeger: Conceptualization; Investigation; Project Administration; Supervision; Writing – original draft; Writing – review & editing. Kristi Holmes: Funding acquisition, Supervision, Writing – review & editing. Luis A. Nunes Amaral: Funding acquisition, Supervision, Writing – review & editing. Christopher Donohue: Conceptualization, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing. Kris Wetterstrand: Conceptualization, Funding acquisition, Project administration.
  4. While open-source software has become ubiquitous, its sustainability is in question: without a constant supply of contributor effort, open-source projects are at risk. While prior work has extensively studied the motivations of open-source contributors in general, relatively little is known about how people choose which project to contribute to, beyond personal interest. This question is especially relevant in transparent social coding environments like GitHub, where visible cues on personal profile and repository pages, known as signals, are known to impact impression formation and decision making. In this paper, we report on a mixed-methods empirical study of the signals that influence contributors’ decisions to join a GitHub project. We first interviewed 15 GitHub contributors about their project evaluation processes and identified the important signals they used, including the structure of the README and the amount of recent activity. Then, we quantitatively tested the impact of each signal using data from 9,977 GitHub projects. We reveal that many important pieces of information lack easily observable signals, and that some signals may be both attractive and unattractive. Our findings have direct implications for open-source maintainers and the design of social coding environments, e.g., features that could be added to facilitate a better project search experience.
  5. This design case details a data science summer learning experience designed by University of Memphis faculty for HBCU students (NSF #: 1918751) with recruiting assistance provided by LeMoyne-Owen College. The summer learning experience included elements of didactic and collaborative problem-solving during the first five weeks of the internship, followed by a three-week, team-based, problem-solving project using real-world data. While the course was originally designed as a face-to-face learning experience, the impact of COVID-19 necessitated a shift toward online digital spaces. The design case details the opportunities and challenges of STEM online learning and especially underscores the limitations of (a) existing data science technologies for instruction, (b) the shift toward instructional design of materials that supported more self-directed learning, and (c) collaborative problem-solving. Implications for design and practice are also considered. 