Knowledge components (KCs) have many applications. In computing education, identifying when students demonstrate specific KCs has been challenging. This paper introduces an entirely data-driven approach for (i) discovering KCs and (ii) demonstrating KCs, using students’ actual code submissions. Our system is based on two expected properties of KCs: (i) they generate learning curves that follow the power law of practice, and (ii) they are predictive of response correctness. We train a neural architecture (named KC-Finder) that classifies the correctness of student code submissions and captures problem-KC relationships. Our evaluation on data from 351 students in an introductory Java course shows that the learned KCs can generate reasonable learning curves and predict code submission correctness. At the same time, some KCs can be interpreted to identify programming skills. We compare the learning curves described by our model to four baselines, showing that (i) identifying KCs with naive methods is a difficult task and (ii) our learning curves exhibit a substantially better curve fit. Our work represents a first step in solving the data-driven KC discovery problem in computing education.
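The power law of practice mentioned in the abstract can be illustrated with a small curve fit. The sketch below is not the KC-Finder model; it assumes a hypothetical list of per-opportunity error rates for a single KC and fits error_rate ≈ a * t^(-b) by least squares in log-log space. The function name `fit_power_law` and the input format are assumptions made for illustration.

```python
import math


def fit_power_law(error_rates):
    """Fit error_rate ≈ a * t**(-b) by least squares in log-log space.

    error_rates[i] is the observed error rate at practice opportunity
    t = i + 1 (hypothetical input format). Needs at least two points,
    and every rate must be positive so the logarithm is defined.
    """
    xs = [math.log(t) for t in range(1, len(error_rates) + 1)]
    ys = [math.log(e) for e in error_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(e) = log(a) - b * log(t), so a = exp(intercept) and b = -slope.
    return math.exp(intercept), -slope


# Synthetic rates that follow the power law exactly, for demonstration only.
rates = [0.5 * t ** -0.4 for t in range(1, 7)]
a, b = fit_power_law(rates)
```

A larger fitted `b` means the error rate drops faster with practice; a curve that fits poorly suggests the candidate KC does not behave like a single learnable skill.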
Generating Feedback-Ladders for Logical Errors in Programming using Large Language Models
In feedback generation for logical errors in programming assignments, large language model (LLM)-based methods have shown great promise. These methods ask the LLM to generate feedback given the problem statement and a student’s (buggy) submission. There are several issues with these types of methods. First, the generated feedback messages are often too direct in revealing the error in the submission and thus diminish valuable opportunities for the student to learn. Second, they do not consider the student’s learning context, i.e., their previous submissions, current knowledge, etc. Third, they are not layered since existing methods use a single, shared prompt for all student submissions. In this paper, we explore using LLMs to generate a “feedback-ladder”, i.e., multiple levels of feedback for the same problem-submission pair. We evaluate the quality of the generated feedback-ladder via a user study with students, educators, and researchers. We have observed diminishing effectiveness for higher-level feedback and higher-scoring submissions overall in the study. In practice, our method enables teachers to select an appropriate level of feedback to show to a student based on their personal learning context, or in a progressive manner to go more detailed if a higher-level feedback fails to correct the student’s error.
- Award ID(s):
- 2215193
- PAR ID:
- 10598957
- Editor(s):
- Benjamin, Paaßen; Carrie, Demmans Epp
- Publisher / Repository:
- International Educational Data Mining Society
- Date Published:
- Format(s):
- Medium: X
- Right(s):
- Creative Commons Attribution 4.0 International
- Sponsoring Org:
- National Science Foundation
More Like this
-
Automated grading systems, or auto-graders, have become ubiquitous in programming education, and the way they generate feedback has become increasingly automated as well. However, there is insufficient evidence regarding auto-grader feedback’s effectiveness in improving student learning outcomes, in a way that differentiates students who utilized the feedback and students who did not. In this study, we fill this critical gap. Specifically, we analyze students’ interactions with auto-graders in an introductory Python programming course, offered at five community colleges in the United States. Our results show that students checking the feedback more frequently tend to get higher scores from their programming assignments overall. Our results also show that a submission that follows a student checking the feedback tends to receive a higher score than a submission that follows a student ignoring the feedback. Our results provide evidence on auto-grader feedback’s effectiveness, encourage their increased utilization, and call for future work to continue their evaluation in this age of automation.
-
SQL is a crucial language for managing relational database systems, and is an essential skill for individuals in roles such as researchers, developers, and business professionals who work with databases. However, learning SQL can be a challenge, presenting an opportunity to study the various methods students use to arrive at semantically equivalent SQL queries. In this study, we examined students’ SQL submissions to homework assignments in the Database Systems course offered to upper-level undergraduate and graduate students at the University of Illinois Urbana-Champaign during the Fall 2022 semester. Our goal was to understand how students arrive at SQL solutions and overcome challenges in the learning process by building on prior research on line chart visualizations that instructors can use to increase visibility on students who are struggling. However, a major limitation of this approach was the difficulty for instructors to sift through a large number of visuals representing each student’s performance on a SQL problem and generate action items at scale, especially when dealing with enrollments of over 700 students. To overcome this limitation, we developed a novel technique to generate textual representations of the student submission sequence using global sequence alignment scores and regular expression algorithms to further compact these submission sequences. This allows instructors to gain insights quickly, on an aggregate level, and in an automated manner, enabling them to identify students who may be struggling with SQL based on their submission sequence characteristics and take appropriate action to improve database education. Our study discovered common textual submission patterns and pattern elements, and we present our recommendations to instructors to improve database education based on these findings.
-
Structured Query Language (SQL), the standard language for relational database management systems, is an essential skill for software developers, data scientists, and professionals who need to interact with databases. SQL is highly structured and presents diverse ways for learners to acquire this skill. However, despite the significance of SQL to other related fields, little research has been done to understand how students learn SQL as they work on homework assignments. In this paper, we analyze students' SQL submissions to homework problems of the Database Systems course offered at the University of Illinois at Urbana-Champaign. For each student, we compute the Levenshtein Edit Distances between every submission and their final submission to understand how students reached their final solution and how they overcame any obstacles in their learning process. Our system visualizes the edit distances between students' submissions to a SQL problem, enabling instructors to identify interesting learning patterns and approaches. These findings will help instructors target their instruction in difficult SQL areas for the future and help students learn SQL more effectively.
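The edit-distance computation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' system: it assumes one student's submissions are available as a chronologically ordered list of query strings, and it uses character-level Levenshtein distance (the abstract does not say whether characters or tokens are compared). The names `levenshtein` and `distances_to_final` are hypothetical.

```python
def levenshtein(a, b):
    """Character-level Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def distances_to_final(submissions):
    """Distance from every submission to the last one (assumed to be the final)."""
    final = submissions[-1]
    return [levenshtein(s, final) for s in submissions]


# Made-up submission sequence for one student on one problem.
attempts = ["SELECT a FROM t", "SELECT a, b FROM t", "SELECT a, b FROM t"]
trajectory = distances_to_final(attempts)
```

Plotting `trajectory` against submission index gives the kind of per-student curve an instructor could inspect: a steadily shrinking distance suggests convergence, while a rising or flat one suggests the student is stuck or restarting.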
-
The data was downloaded and captured through MBSE online learning modules. Deidentified learners' activities within the modules, such as clickstreams and assignments, were captured in the data/
• All files here are student submissions to one or more of the modules in the MBSE program.
• All user data has either been removed or redacted from the submission.
• “Andrew Hurt” is not a student and none of these files came from him. He is the person who did the redaction.
• The naming structure of the files is as follows: [Module number]-[Module Offering Date]-[Submission Number]-[Part Number of Submission]. Example: M5-040422-S1-Part3. This file is from Module 5, which was offered on April 4, 2022, it is submission 1, and part 3 of submission 1.
• Note that submissions within or between modules are not necessarily connected to specific students. So “Submission 1” from module 5 is not the same user as “Submission 1” from module 6.
• Not all submissions have multiple parts.
• No .mdzip files (proprietary MagicDraw software files) have been included in this list.
• If a module or folder in the module is missing content from a particular offering, it is because either no one submitted anything or because the file was a .mdzip file and was not downloaded.
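The file naming structure above can be parsed mechanically. The sketch below is an illustration under stated assumptions: the date is read as MMDDYY (consistent with the single example, 040422 for April 4, 2022), and single-part submissions are assumed to simply omit the `-PartN` suffix, which the description does not confirm. The function name `parse_submission_name` is hypothetical, and any file extension should be stripped before calling it.

```python
import re
from datetime import datetime

# Assumed pattern: M<module>-<MMDDYY>-S<submission>[-Part<part>]
NAME_PATTERN = re.compile(r"^M(\d+)-(\d{6})-S(\d+)(?:-Part(\d+))?$")


def parse_submission_name(name):
    """Split a submission file name (without extension) into its fields.

    Returns a dict, or raises ValueError if the name does not match the
    assumed pattern. 'part' is None when no -PartN suffix is present.
    """
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized submission name: {name!r}")
    module, offered, submission, part = match.groups()
    return {
        "module": int(module),
        "offered": datetime.strptime(offered, "%m%d%y").date(),
        "submission": int(submission),
        "part": int(part) if part is not None else None,
    }


record = parse_submission_name("M5-040422-S1-Part3")
```

Because submission numbers are not tied to specific students across modules, the parsed `submission` field should only be used to group parts within one module offering, not to link records between modules.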

