Incremental development is the process of writing a small snippet of code and testing it before moving on. For students in introductory programming courses, incremental development is especially valuable: they tend to encounter more syntax errors, lack the proficiency to address complicated bugs, and may be more prone to frustration when struggling to correct code. However, to evaluate the effectiveness of interventions that aim to teach programming processes such as incremental development, we need measures that assess those processes. In this paper, we present a way to measure incremental development. By qualitatively analyzing 15 student coding interviews, we identified common behaviors in the programming process that relate to incremental development. We then leveraged a dataset of over 1000 development sessions -- about 52,000 code snapshots taken at compilation time -- to automatically detect the behaviors identified in our qualitative analysis. Finally, we crafted a formal metric, the "Measure of Incremental Development" (MID), to quantify how effectively a student used incremental development during a programming session. The MID detects common non-incremental development patterns, such as excessive debugging after large additions of code, to automatically assess a sequence of snapshots. The MID aligns with human evaluations of incrementality with over 80% accuracy. Our metric enables new research directions and interventions focused on improving students' development practices.
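To make the snapshot analysis concrete, here is a minimal Python sketch of detecting one non-incremental pattern named in the abstract (a large addition of code followed by a run of small debugging edits). The snapshot format, thresholds, and pattern test are illustrative assumptions, not the published MID formula.

```python
# Illustrative sketch only: thresholds and the pattern test are assumptions
# for exposition; the actual MID formula is defined in the paper.
import difflib

def lines_added(prev: str, curr: str) -> int:
    """Count lines added between two consecutive code snapshots."""
    diff = difflib.ndiff(prev.splitlines(), curr.splitlines())
    return sum(1 for line in diff if line.startswith("+ "))

def flag_non_incremental(snapshots: list[str],
                         big_addition: int = 20,
                         debug_run: int = 5) -> list[int]:
    """Flag snapshots where a large addition is followed by a long run of
    small 'debugging' edits -- a common non-incremental pattern."""
    flags = []
    added = [lines_added(a, b) for a, b in zip(snapshots, snapshots[1:])]
    for i, n in enumerate(added):
        if n >= big_addition:
            # Count how many of the immediately following edits are small churn.
            churn = 0
            for m in added[i + 1:]:
                if m < 3:
                    churn += 1
                else:
                    break
            if churn >= debug_run:
                flags.append(i + 1)  # snapshot where the big addition landed
    return flags
```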
Assisting Teaching Assistants with Automatic Code Corrections
Undergraduate Teaching Assistants (TAs) in Computer Science courses are often the first and only point of contact when a student gets stuck on a programming problem. Yet these TAs are often relative beginners themselves, both in programming and in teaching. In this paper, we examine the impact of the availability of corrected code on TAs' ability to find, fix, and address bugs in student code. We found that seeing a corrected version of the student code helps TAs debug code 29% faster and write more accurate and complete student-facing explanations of the bugs (30% more likely to correctly address a given bug). We also observed that TAs do not generally struggle with conceptual understanding of the underlying material; rather, their difficulties seem more related to working memory, attention, and overall high cognitive load.
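The intervention hinges on showing TAs a corrected version of the student's code. The paper does not specify its tooling, so the sketch below is just one plausible way to render that comparison, with a hypothetical student file and bug:

```python
# Hypothetical example: render a student-vs-corrected comparison as a unified
# diff, the kind of side information TAs received in the study.
import difflib

student_code = """def average(xs):
    total = 0
    for x in xs:
        total += x
    return total / len(xs) + 1   # bug: stray '+ 1'
"""

corrected_code = """def average(xs):
    total = 0
    for x in xs:
        total += x
    return total / len(xs)
"""

diff = difflib.unified_diff(
    student_code.splitlines(keepends=True),
    corrected_code.splitlines(keepends=True),
    fromfile="student.py",
    tofile="corrected.py",
)
print("".join(diff))
```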
- Award ID(s): 2214538
- PAR ID: 10636715
- Publisher / Repository: ACM
- Date Published:
- Journal Name: CHI Conference Proceedings
- ISSN: 2159-6468
- ISBN: 9781450391573
- Page Range / eLocation ID: 1 to 18
- Format(s): Medium: X
- Location: New Orleans, LA, USA
- Sponsoring Org: National Science Foundation
More Like this
This innovative practice WIP paper describes our ongoing development and deployment of an online robotics education platform, which revealed a gap in providing the interactive, feedback-rich learning environment essential for mastering programming concepts in robotics, something students were not getting with the traditional code, simulate, turn-in workflow. Since teaching resources are limited, students would benefit from real-time feedback that helps them find and fix mistakes in their programming assignments. To integrate such automated feedback, this paper focuses on creating a system for unit testing and integrating it into the course workflow. We facilitate this real-time feedback by including unit tests in the design of programming assignments so students can understand and fix their errors on their own, without instructors/TAs serving as a bottleneck. In line with the framework's personalized, student-centered approach, this method makes it easier for students to revise and debug their programming work, encouraging hands-on learning. The updated course workflow, which includes unit tests, will strengthen the learning environment and make it more interactive so that students can learn how to program robots in a self-guided fashion.
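As a minimal illustration of the assignment-embedded unit tests this abstract describes, the snippet below pairs a student-implemented function with pytest-style tests; the robot kinematics function and expected behaviors are hypothetical, not the course's actual API.

```python
# Hypothetical assignment-embedded unit tests; the function under test and
# its expected behaviors are assumptions, not the course's actual robot API.
import math

def wheel_speeds(v: float, omega: float, track: float = 0.2) -> tuple[float, float]:
    """Student-implemented differential-drive kinematics: linear velocity v
    (m/s) and angular velocity omega (rad/s) -> (left, right) wheel speeds."""
    left = v - omega * track / 2
    right = v + omega * track / 2
    return left, right

def test_straight_line():
    # Driving straight: both wheels match the linear velocity.
    assert wheel_speeds(1.0, 0.0) == (1.0, 1.0)

def test_turn_in_place():
    # Pure rotation: wheel speeds are equal and opposite.
    left, right = wheel_speeds(0.0, math.pi)
    assert math.isclose(left, -right)
```

Running these tests locally (for example, with pytest) gives students immediate, self-guided feedback instead of waiting for an instructor or TA.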
To address the increasing demand for AI literacy, we introduced a novel active learning approach that leverages both teaching assistants (TAs) and generative AI to provide feedback during in-class exercises. This method was evaluated through two studies in separate Computer Science courses, focusing on the roles and impacts of TAs in this learning environment, as well as their collaboration with ChatGPT in enhancing student feedback. The studies revealed that TAs were effective in accurately determining students' progress and struggles, particularly in areas such as "backtracking", where students faced significant challenges. The intervention's success was evident from high student engagement and satisfaction levels, as reported in an end-of-semester survey. Further findings highlighted that while TAs provided detailed technical assessments and identified conceptual gaps effectively, ChatGPT excelled in presenting clarifying examples and offering motivational support. Despite some TAs' resistance to fully embracing the feedback guidelines (specifically, a reluctance to provide encouragement), the collaborative feedback process between TAs and ChatGPT improved the quality of feedback in several aspects, including technical accuracy and clarity in explaining conceptual issues. These results suggest that integrating human and artificial intelligence in educational settings can significantly enhance traditional teaching methods, creating a more dynamic and responsive learning environment. Future research will aim to improve both the quality and efficiency of feedback, capitalizing on the unique strengths of both humans and AI to further advance educational practices in the field of computing.
The recent public releases of AI tools such as ChatGPT have forced computer science educators to reconsider how they teach. These tools have demonstrated considerable ability to generate code and answer conceptual questions, rendering them incredibly useful for completing CS coursework. While overreliance on AI tools could hinder students' learning, we believe they have the potential to be a helpful resource for both students and instructors alike. We propose a novel system for instructor-mediated GPT interaction in a class discussion board. By automatically generating draft responses to student forum posts, GPT can help Teaching Assistants (TAs) respond to student questions in a more timely manner, giving students an avenue to receive fast, quality feedback on their solutions without turning to ChatGPT directly. Additionally, since they are involved in the process, instructors can ensure that the information students receive is accurate, and can provide students with incremental hints that encourage them to engage critically with the material, rather than just copying an AI-generated snippet of code. We utilize Piazza, a popular educational forum where TAs help students via text exchanges, as a venue for GPT-assisted TA responses to student questions. These student questions are sent to GPT-4 alongside assignment instructions and a customizable prompt, both of which are stored in editable instructor-only Piazza posts. We demonstrate an initial implementation of this system, and provide examples of student questions that highlight its benefits.
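A minimal sketch of the draft-generation step this abstract describes, assuming the OpenAI Python client (openai >= 1.0); the function name, prompt wording, and the way the instructor prompt and assignment instructions reach the code are placeholders, not the authors' implementation:

```python
# Sketch under assumptions: in the real system the instructor prompt and
# assignment instructions live in editable instructor-only Piazza posts;
# here they arrive as plain arguments.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def draft_ta_response(student_question: str,
                      assignment_instructions: str,
                      instructor_prompt: str) -> str:
    """Return a draft answer for a TA to review, edit, and post."""
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": instructor_prompt},
            {"role": "user", "content":
                f"Assignment instructions:\n{assignment_instructions}\n\n"
                f"Student question:\n{student_question}\n\n"
                "Write a draft reply with incremental hints, not full code."},
        ],
    )
    return completion.choices[0].message.content
```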
Novice programmers often face challenges in designing computational artifacts and fixing code errors, which can lead to task abandonment and over-reliance on external support. While research has explored effective meta-cognitive strategies to scaffold novice programmers' learning, it is essential to first understand and assess students' conceptual, procedural, and strategic/conditional programming knowledge at scale. To address this issue, we propose a three-model framework that leverages Large Language Models (LLMs) to simulate, classify, and correct student responses to programming questions based on the SOLO Taxonomy. The SOLO Taxonomy provides a structured approach for categorizing student understanding into four levels: Pre-structural, Uni-structural, Multi-structural, and Relational. Our results showed that GPT-4o achieved high accuracy in generating and classifying responses for the Relational category, with moderate accuracy in the Uni-structural and Pre-structural categories, but struggled with the Multi-structural category. The model successfully corrected responses to the Relational level. Although further refinement is needed, these findings suggest that LLMs hold significant potential for supporting computer science education by assessing programming knowledge and guiding students toward deeper cognitive engagement.
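For illustration, here is a minimal sketch of the classification step in the framework this abstract describes, again via the OpenAI client; the prompt wording and fallback label are assumptions, and the paper's full three-model pipeline is not reproduced here.

```python
# Illustrative only: prompt wording and the "Unclassified" fallback are
# assumptions; the paper's exact prompts and pipeline are not reproduced.
from openai import OpenAI

SOLO_LEVELS = ["Pre-structural", "Uni-structural", "Multi-structural", "Relational"]

def classify_solo(question: str, answer: str, client: OpenAI) -> str:
    """Ask the model to place a student answer into one SOLO level."""
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": (
                "Classify the student's answer to the programming question "
                f"into exactly one SOLO level from {SOLO_LEVELS}.\n\n"
                f"Question: {question}\nAnswer: {answer}\n\n"
                "Respond with the level name only."
            ),
        }],
    )
    label = completion.choices[0].message.content.strip()
    return label if label in SOLO_LEVELS else "Unclassified"
```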