
Title: Interpreting Latent Student Knowledge Representations in Programming Assignments
Recent advances in artificial intelligence for education leverage generative large language models, including using them to predict open-ended student responses rather than only their correctness. However, the black-box nature of these models limits the interpretability of the learned student knowledge representations. In this paper, we conduct a first exploration into interpreting latent student knowledge representations by presenting InfoOIRT, an Information-regularized Open-ended Item Response Theory model, which encourages the latent student knowledge states to be interpretable while remaining able to generate student-written code for open-ended programming questions. InfoOIRT maximizes the mutual information between a fixed subset of latent knowledge states, enforced with simple prior distributions, and the generated student code, which encourages the model to learn disentangled representations of salient syntactic and semantic code features, including syntactic styles, mastery of programming skills, and code structures. Through experiments on a real-world programming education dataset, we show that InfoOIRT can both accurately generate student code and lead to interpretable student knowledge representations.
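The mutual-information regularization described above can be illustrated with a minimal PyTorch sketch, in the spirit of InfoGAN-style objectives: a small, fixed subset of the latent knowledge state is drawn from a simple prior, and an auxiliary network is trained to recover it from the hidden representation of the generated code. All module names, dimensions, the Gaussian prior, and the squared-error bound below are illustrative assumptions, not InfoOIRT's actual architecture or training code.

# Minimal sketch (PyTorch) of an InfoGAN-style mutual-information term.
# Assumed sizes and modules; not the paper's implementation.
import torch
import torch.nn as nn

LATENT_DIM, INFO_DIM, CODE_HID = 64, 8, 256  # assumed dimensions

q_net = nn.Sequential(              # variational posterior q(c | generated code)
    nn.Linear(CODE_HID, 128), nn.ReLU(), nn.Linear(128, INFO_DIM)
)

def mutual_info_loss(code_hidden, info_latent):
    """Negative variational lower bound on I(c; generated code),
    up to constants, under an assumed Gaussian posterior."""
    predicted = q_net(code_hidden)                  # try to recover the regularized subset
    return ((predicted - info_latent) ** 2).mean()  # squared error ~ Gaussian log-likelihood

# Training-step sketch: sample the interpretable subset from a simple prior,
# concatenate it with the remaining knowledge state, and penalize the model
# if the subset cannot be recovered from the generated code.
batch = 16
info_c = torch.randn(batch, INFO_DIM)               # simple prior p(c)
rest_z = torch.randn(batch, LATENT_DIM - INFO_DIM)  # remaining latent knowledge state
student_state = torch.cat([info_c, rest_z], dim=1)  # would condition a code generator (omitted here)
code_hidden = torch.randn(batch, CODE_HID)          # stand-in for the generator's hidden state
loss = mutual_info_loss(code_hidden, info_c)        # weighted and added to the generation loss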
Award ID(s):
2215193
PAR ID:
10598956
Author(s) / Creator(s):
;
Editor(s):
Benjamin, Paaßen; Carrie, Demmans Epp
Publisher / Repository:
International Educational Data Mining Society
Date Published:
Format(s):
Medium: X
Right(s):
Creative Commons Attribution 4.0 International
Sponsoring Org:
National Science Foundation
More Like this
  1. Novice programmers can greatly improve their understanding of challenging programming concepts by studying worked examples that demonstrate how those concepts are implemented. Despite the extensive repositories of effective worked examples created by CS education experts, a key challenge remains: identifying the worked example most relevant to a given programming problem and to the specific difficulties a student faces in solving it. Previous studies have explored similar example-recommendation approaches. Our research introduces a novel method that uses deep learning code representation models to generate code vectors, capturing both syntactic and semantic similarities among programming examples. Driven by the need to provide relevant, personalized examples to programming students, our approach emphasizes similarity assessment and clustering techniques to identify similar code problems, examples, and challenges. This method aims to deliver more accurate and contextually relevant recommendations based on individual learning needs. Providing tailored support to students in real time facilitates better problem-solving strategies and enhances students' learning experiences, contributing to the advancement of programming education.
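A hedged sketch of the similarity-and-clustering pipeline described in entry 1: embed code snippets as vectors, cluster them, and recommend the worked example closest to the student's current problem. The placeholder embed function and toy snippets below are assumptions; the study itself relies on deep learning code representation models.

# Sketch only: random vectors stand in for a real code-representation model.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def embed(code_snippets):
    # Placeholder: replace with a pre-trained code encoder that yields
    # one vector per snippet.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(code_snippets), 128))

examples = ["for (int i = 0; i < n; i++) ...", "while (left < right) ..."]
problem = ["// nested loop over a 2D array ..."]

ex_vecs, prob_vec = embed(examples), embed(problem)

# Group examples into clusters of similar problems/concepts.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(ex_vecs)

# Rank examples by cosine similarity to the student's problem.
scores = cosine_similarity(prob_vec, ex_vecs)[0]
best = int(np.argmax(scores))
print(f"recommend example {best} (cluster {clusters[best]}, score {scores[best]:.2f})")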
  2. Automated analysis of programming data using code representation methods offers valuable services for programmers, from code completion to clone detection to bug detection. Recent studies show the effectiveness of Abstract Syntax Trees (ASTs), pre-trained Transformer-based models, and graph-based embeddings for programming code representation. However, pre-trained large language models lack interpretability, while other embedding-based approaches struggle to extract important information from large ASTs. This study proposes a novel Subtree-based Attention Neural Network (SANN) to address these gaps by integrating several components: an optimized sequential subtree extraction process using genetic algorithm optimization, a two-way embedding approach, and an attention network. We investigate the effectiveness of SANN by applying it to two tasks, program correctness prediction and algorithm detection, on two educational datasets containing small- and large-scale code snippets written in Java and C, respectively. The experimental results show SANN's competitive predictive accuracy against baseline models from the literature, including code2vec, ASTNN, TBCNN, CodeBERT, GPT-2, and MVG. Finally, a case study is presented to show the interpretability of our model's predictions and its application to an important human-centered computing application, student modeling. Our results indicate the effectiveness of the SANN model in capturing important syntactic and semantic information from students' code, allowing the construction of accurate student models, which serve as the foundation for generating adaptive instructional support such as individualized hints and feedback.
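The attention-network stage described in entry 2 can be sketched as attention pooling over per-subtree embeddings. The sketch below assumes the genetic-algorithm subtree extraction and two-way embedding have already produced one vector per subtree; the module, sizes, and scoring function are illustrative, not SANN's published implementation.

# Minimal PyTorch sketch of attention pooling over subtree embeddings.
import torch
import torch.nn as nn

class SubtreeAttentionPool(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # learns which subtrees matter most

    def forward(self, subtree_vecs):     # shape: (num_subtrees, dim)
        weights = torch.softmax(self.score(subtree_vecs), dim=0)
        return (weights * subtree_vecs).sum(dim=0)   # single program-level vector

pool = SubtreeAttentionPool()
subtrees = torch.randn(12, 128)          # e.g., 12 extracted AST subtrees (toy data)
program_vec = pool(subtrees)             # would feed a correctness/algorithm classifier
print(program_vec.shape)                 # torch.Size([128])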
  3. Assessing student responses is a critical task in adaptive educational systems. More specifically, automatically evaluating students' self-explanations contributes to understanding their knowledge state, which is needed for personalized instruction, the crux of adaptive educational systems. To facilitate the development of Artificial Intelligence (AI) and Machine Learning models for automated assessment of learners' self-explanations, annotated datasets are essential. In response to this need, we developed the SelfCode2.0 corpus, which consists of 3,019 pairs of student and expert explanations of Java code snippets, each annotated with semantic similarity, correctness, and completeness scores provided by experts. Alongside the dataset, we also provide performance results obtained with several baseline models based on TF-IDF and Sentence-BERT vectorial representations. This work aims to enhance the effectiveness of automated assessment tools in programming education and to contribute to better understanding and supporting student learning of programming.
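A minimal illustration of the TF-IDF baseline mentioned in entry 3: vectorize a student explanation and an expert explanation and use their cosine similarity as a proxy score. The example sentences and the idea of feeding the score to a downstream regressor are assumptions, not the corpus's official evaluation code.

# Sketch of a TF-IDF similarity baseline for self-explanation assessment.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

student = "The loop adds each array element to sum."           # hypothetical student explanation
expert = "This for loop iterates over the array and accumulates the total in sum."  # hypothetical expert explanation

vec = TfidfVectorizer().fit([student, expert])
s, e = vec.transform([student]), vec.transform([expert])
similarity = cosine_similarity(s, e)[0, 0]
print(f"semantic-similarity proxy: {similarity:.2f}")
# In practice, similarity features like this (or Sentence-BERT embeddings) would be
# fed to a model trained on the annotated similarity, correctness, and completeness scores.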
  4. Open-ended programming engages students by connecting computing with their real-world experience and personal interests. However, such open-ended programming tasks can be challenging, as they require students to implement features that they may be unfamiliar with. Code examples help students generate ideas and implement program features, but students also encounter many learning barriers when using them. We explore how to design code examples that support novices' effective example use by presenting our experience of building and deploying Example Helper, a system that supports students with a gallery of code examples during open-ended programming. We deployed Example Helper in an undergraduate CS0 classroom to investigate students' example-usage experience, finding that students used different strategies to browse, understand, experiment with, and integrate code examples, and that students who made more sophisticated plans also used more examples in their projects.
  5. Early prediction of student difficulty during long-duration learning activities allows a tutoring system to intervene by providing needed support, such as a hint, or by alerting an instructor. To be effective, these predictions must come early and be highly accurate, but such predictions are difficult for open-ended programming problems. In this work, Recent Temporal Patterns (RTPs) are used in conjunction with Support Vector Machines and Logistic Regression to build robust yet interpretable models for early prediction. We performed two tasks: predicting student success and predicting difficulty during a single open-ended novice programming task of drawing a square-shaped spiral. We compared RTP against several machine learning models, ranging from classic approaches to more recent deep learning models such as Long Short-Term Memory, to predict whether students would be able to complete the programming task. Our results show that RTP-based models outperformed all others and could successfully classify students after just one minute of a 20-minute exercise (students can spend more than an hour on it). To determine when a system might intervene to prevent incompleteness or eventual dropout, we applied RTP at regular intervals to predict whether a student would make progress within the next five minutes, reflecting that they may be having difficulty. RTP successfully classified these students needing interventions over 85% of the time, with increased accuracy when using data-driven program features. These results contribute significantly to the potential to build a fully data-driven tutoring system for novice programming.
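A greatly simplified sketch of the early-prediction setup in entry 5: a logistic regression over binary indicators of temporal patterns observed in the first minute of a session. The toy features, data, and 0.5 intervention threshold are assumptions; actual Recent Temporal Pattern mining discovers the patterns from time-stamped event logs.

# Toy early-prediction sketch; real RTP features come from mined event-log patterns.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [ran_code_early, repeated_syntax_error, long_idle_gap, deleted_most_code]
X_train = np.array([
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
])
y_train = np.array([0, 1, 0, 1])   # 1 = student will struggle / need help (toy labels)

model = LogisticRegression().fit(X_train, y_train)

first_minute = np.array([[0, 1, 0, 1]])        # a new student's first-minute patterns
p_struggle = model.predict_proba(first_minute)[0, 1]
if p_struggle > 0.5:                            # assumed intervention threshold
    print(f"flag for a hint or instructor alert (p={p_struggle:.2f})")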