Title: Student Teamwork on Programming Projects: What can GitHub logs show us?
Teamwork, often mediated by version control systems such as Git and Apache Subversion (SVN), is central to professional programming. As a consequence, many colleges are incorporating both collaboration and online development environments into their curricula, even in introductory courses. In this research, we collected GitHub logs from two programming projects in two offerings of a CS2 Java programming course for computer science majors. Students worked in pairs for both projects (one optional, the other mandatory) in each year. We used the students’ GitHub history to classify the student teams into three groups (collaborative, cooperative, or solo-submit) based on the division of labor. We then calculated different metrics for students’ teamwork, including the total number and the average number of commits in different parts of the projects, and used these metrics to predict the students’ teamwork style. Our findings show that we can identify the students’ teamwork style automatically from their submission logs. This work helps us to better understand novices’ habits while using version control systems. Knowledge of these habits can help identify harmful working styles and might lead to the development of automatic scaffolds for teamwork and peer support in the future.
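As a rough illustration of the kind of pipeline the abstract describes, the sketch below derives a few commit-based features for one team and maps them to the three teamwork styles. It is a minimal sketch under stated assumptions, not the authors' method: the feature names, the (author, path) input format, and both thresholds are hypothetical, and the abstract does not say which metrics or classifier were actually used.

```python
from collections import Counter, defaultdict


def teamwork_features(commits):
    """Compute simple features from (author, path) pairs for one team repository.

    The input format and the feature set are assumptions made for illustration.
    """
    commits = list(commits)  # allow any iterable, since we iterate twice
    per_author = Counter(author for author, _ in commits)
    files_by_author = defaultdict(set)
    for author, path in commits:
        files_by_author[author].add(path)

    total = sum(per_author.values())
    # Share of all commits made by the most active team member.
    top_share = max(per_author.values()) / total if total else 0.0

    # Fraction of touched files that every committing member has edited.
    file_sets = list(files_by_author.values())
    if len(file_sets) >= 2:
        union = set().union(*file_sets)
        shared = set.intersection(*file_sets)
        file_overlap = len(shared) / len(union)
    else:
        file_overlap = 0.0

    return {"total_commits": total, "top_share": top_share, "file_overlap": file_overlap}


def classify_team(features, solo_threshold=0.9, overlap_threshold=0.5):
    """Map features to a teamwork style using hypothetical thresholds."""
    if features["top_share"] >= solo_threshold:
        return "solo-submit"      # one member made nearly all commits
    if features["file_overlap"] >= overlap_threshold:
        return "collaborative"    # members mostly worked on the same files
    return "cooperative"          # members split the work into separate files
```

In practice the thresholds would need to be tuned or replaced by a trained classifier on labeled teams; the fixed rule here only shows how division of labor could be read off a commit log.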
Award ID(s):
1821475
PAR ID:
10392590
Author(s) / Creator(s):
; ; ; ; ;
Editor(s):
Rafferty, Anna N.; Whitehill, Jacob; Cavalli-Sforza, Violetta; Romero, Cristobal
Date Published:
Journal Name:
Proceedings of The 13th International Conference on Educational Data Mining (EDM 2020)
Page Range / eLocation ID:
409 - 416
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Cybersecurity continues to be a critical aspect within every computing division, especially in the realm of operating system (OS) development. The OS resides at a low layer, directly above the hardware, in the computing hierarchy. Even if the layers above the OS are well hardened, a security flaw in the OS will compromise the resources in those higher layers. Although several learning resources and courses are available for OS security, they are taught in advanced undergraduate (UG) or graduate-level computer security classes. In this work, we develop cybersecurity educational modules that instructors can adopt in their OS courses to emphasize security in the OS while teaching its concepts. The goal of this work is to engage students in learning the security aspects of an OS while they learn its concepts. It will give students a good understanding of different security concepts and how they are implemented in the OS. Towards this, we develop security educational modules for an OS course that will be available to instructors for adoption in their courses. These modules are designed to be used in a UG-level OS course. To work on these modules, students should be familiar with C programming and the OS concepts taught in the class. The modules are intended to be completed within the course of a semester. To achieve this goal, we organize them into three mini-projects, each of which can be completed within a few weeks. We chose xv6 as the platform for developing the modules due to its popularity as an educational OS. To develop the modules, we referred to the recent version of a popular OS textbook for the security concepts. The topics discussed in it include authentication, authorization, cryptography, and distributed system security. We kept our educational modules mostly aligned with these topics, except distributed system security. We also included a module for implementing a defense mechanism against buffer-overflow attacks, a well-known software vulnerability.
We created three mini-projects for these modules, each accompanied by proper documentation and a GitHub repository. Two versions are created for each project: one for the student assignment, available in the repository, and another as a solution version for instructors. The first project implements a user authentication system in xv6. Students will implement various specifications, such as a password structure with encryption, and programs such as useradd, passwd, whoami, and login. The implementation guidelines are provided in the documentation, along with skeleton code. The authorization project implements the Unix-style access control system. In this project, students will modify and create various structures and functions within the xv6 kernel. The last project is to build a defense mechanism against buffer overflow using Address Space Layout Randomization (ASLR). Students are expected to implement a random number generator and modify the executable file loader in xv6. The submission for each project is expected to demonstrate module behavior comparable to that of the corresponding systems in a production-grade OS, such as Linux.
  2. Lynch, Collin F.; Merceron, Agathe; Desmarais, Michel; Nkambou, Roger (Ed.)
    Students’ interactions with online tools can provide us with insights into their study and work habits. Prior research has shown that these habits, even ones as simple as the number of actions or the time spent on online platforms, can distinguish between higher-performing and lower-performing students. These habits are also often used to predict students’ performance in classes. One key feature of these actions that is often overlooked is how and when the students transition between different online platforms. In this work, we study sequences of student transitions between online tools in blended courses and identify which habits make the most difference between the higher- and lower-performing groups. While our results showed that most of the time students focus on a single tool, we were able to find patterns in their transitions that differentiate high- and low-performing groups. These findings can help instructors to provide procedural guidance to the students, as well as to identify harmful habits and make timely interventions.
  3. Background and Context: GitHub has been recently used in Software Engineering (SE) classes to facilitate collaboration in student team projects as well as help teachers to evaluate the contributions of their students more objectively. Objective: We explore the benefits and drawbacks of using GitHub as a means for team collaboration and performance evaluation in large SE classes. Method: Our research method takes the form of a case study conducted in a senior level SE class with 91 students. Our study also includes entry and exit surveys, an exit interview, and a qualitative analysis of students’ commit behavior. Findings: Different teams adapt GitHub to their workflow differently. Furthermore, despite the steep learning curve, using GitHub should not affect the quality of students’ submissions. However, using GitHub metrics as a proxy for evaluating team performance can be risky. Implications: We provide several recommendations for integrating Web-based configuration management tools in SE classes. 
  4. Although there are tools to help developers understand the matching behaviors between a regular expression and a string, regular-expression-related faults are still common. Learning developers’ behavior through the change history of regular expressions can identify common edit patterns, which can inform the creation of mutation and repair operators to assist with testing and fixing regular expressions. In this work, we explore how regular expressions evolve over time, focusing on the characteristics of regular expression edits, the syntactic and semantic difference of the edits, and the feature changes of edits. Our exploration uses two datasets. First, we look at GitHub projects that have a regular expression in their current version and look back through the commit logs to collect the regular expressions’ edit history. Second, we collect regular expressions composed by study participants during problem-solving tasks. Our results show that 1) 95% of the regular expressions from GitHub are not edited, 2) most edited regular expressions have a syntactic distance of 4-6 characters from their predecessors, 3) over 50% of the edits in GitHub tend to expand the scope of the regular expression, and 4) the number of features used indicates that regular expression language usage increases over time. This work has implications for supporting regular expression repair and mutation to ensure test suite quality.
  5. Assessing team software development projects is notoriously difficult and typically based on subjective metrics. To help make assessments more rigorous, we conducted an empirical study to explore relationships between subjective metrics based on peer and instructor assessments, and objective metrics based on GitHub and chat data. We studied 23 undergraduate software teams (n = 117 students) from two undergraduate computing courses at two North American research universities. We collected data on teams’ (a) commits and issues from their GitHub code repositories, (b) chat messages from their Slack and Microsoft Teams channels, (c) peer evaluation ratings from the CATME peer evaluation system, and (d) individual assignment grades from the courses. We derived metrics from (a) and (b) to measure both individual team members’ contributions to the team, and the equality of team members’ contributions. We then performed Pearson analyses to identify correlations among the metrics, peer evaluation ratings, and individual grades. We found significant positive correlations between team members’ GitHub contributions, chat contributions, and peer evaluation ratings. In addition, the equality of teams’ GitHub contributions was positively correlated with teams’ average peer evaluation ratings and negatively correlated with the variance in those ratings. However, no such positive correlations were detected between the equality of teams’ chat contributions and their peer evaluation ratings. Our study extends previous research results by providing evidence that (a) team members’ chat contributions, like their GitHub contributions, are positively correlated with their peer evaluation ratings; (b) team members’ chat contributions are positively correlated with their GitHub contributions; and (c) the equality of teams’ GitHub contributions is positively correlated with their peer evaluation ratings.
These results lend further support to the idea that combining objective and subjective metrics can make the assessment of team software projects more comprehensive and rigorous. 
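For readers unfamiliar with the Pearson analysis mentioned in this abstract, the following minimal sketch computes the Pearson correlation coefficient between two per-student metrics, for example commit counts and peer evaluation ratings. The function name and the input lists are hypothetical and contain no data from the study; the abstract does not specify how the authors computed their statistics.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and each variable's sum of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near +1 indicates that students with more of one metric tend to have more of the other, a value near -1 indicates the opposite, and a value near 0 indicates no linear relationship.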