skip to main content


Title: Deep Learning vs. Bayesian Knowledge Tracing: Student Models for Interventions.
Bayesian Knowledge Tracing (BKT) is a commonly used approach for student modeling, and Long Short Term Memory (LSTM) is a versatile model that can be applied to a wide range of tasks, such as language translation. In this work, we directly compared three models: BKT, its variant Intervention-BKT (IBKT), and LSTM, on two types of student modeling tasks: post-test scores prediction and learning gains prediction. Additionally, while previous work on student learning has often used skill/knowledge components identified by domain experts, we incorporated an automatic skill discovery method (SK), which includes a nonparametric prior over the exercise-skill assignments, to all three models. Thus, we explored a total of six models: BKT, BKT+SK, IBKT, IBKT+SK, LSTM, and LSTM+SK. Two training datasets were employed, one was collected from a natural language physics intelligent tutoring system named Cordillera, and the other was from a standard probability intelligent tutoring system named Pyrenees. Overall, our results showed that BKT and BKT+SK outperformed the others on predicting post-test scores, whereas LSTM and LSTM+SK achieved the highest accuracy, F1-measure, and area under the ROC curve (AUC) on predicting learning gains. Furthermore, we demonstrated that by combining SK with the BKT model, BKT+SK could reliably predict post-test scores using only the earliest 50% of the entire training sequences. For learning gain early prediction, using the earliest 70% of the entire sequences, LSTM can deliver a comparable prediction as using the entire training sequences. The findings yield a learning environment that can foretell students’ performance and learning gains early, and can render adaptive pedagogical strategy accordingly.  more » « less
Award ID(s):
1651909
NSF-PAR ID:
10136454
Author(s) / Creator(s):
Date Published:
Journal Name:
Journal of educational data mining
Volume:
10
Issue:
2
ISSN:
2157-2100
Page Range / eLocation ID:
28-54
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    Abstract: Modeling student learning processes is highly complex since it is influenced by many factors such as motivation and learning habits. The high volume of features and tools provided by computer-based learning environments confounds the task of tracking student knowledge even further. Deep Learning models such as Long-Short Term Memory (LSTMs) and classic Markovian models such as Bayesian Knowledge Tracing (BKT) have been successfully applied for student modeling. However, much of this prior work is designed to handle sequences of events with discrete timesteps, rather than considering the continuous aspect of time. Given that time elapsed between successive elements in a student’s trajectory can vary from seconds to days, we applied a Timeaware LSTM (T-LSTM) to model the dynamics of student knowledge state in continuous time. We investigate the effectiveness of T-LSTM on two domains with very different characteristics. One involves an open-ended programming environment where students can self-pace their progress and T-LSTM is compared against LSTM, Recent Temporal Pattern Mining, and the classic Logistic Regression (LR) on the early prediction of student success; the other involves a classic tutor-driven intelligent tutoring system where the tutor scaffolds the student learning step by step and T-LSTM is compared with LSTM, LR, and BKT on the early prediction of student learning gains. Our results show that TLSTM significantly outperforms the other methods on the self-paced, open-ended programming environment; while on the tutor-driven ITS, it ties with LSTM and outperforms both LR and BKT. In other words, while time-irregularity exists in both datasets, T-LSTM works significantly better than other student models when the pace is driven by students. On the other hand, when such irregularity results from the tutor, T-LSTM was not superior to other models but its performance was not hurt either. 
    more » « less
  2. In this work, we propose a video-based transfer learning approach for predicting problem outcomes of students working with an intelligent tutoring system (ITS). By analyzing a student's face and gestures, our method predicts the outcome of a student answering a problem in an ITS from a video feed. Our work is motivated by the reasoning that the ability to predict such outcomes enables tutoring systems to adjust interventions, such as hints and encouragement, and to ultimately yield improved student learning. We collected a large labeled dataset of student interactions with an intelligent online math tutor consisting of 68 sessions, where 54 individual students solved 2,749 problems. We will release this dataset publicly upon publication of this paper. It will be available at https://www.cs.bu.edu/faculty/betke/research/learning/. Working with this dataset, our transfer-learning challenge was to design a representation in the source domain of pictures obtained “in the wild” for the task of facial expression analysis, and transferring this learned representation to the task of human behavior prediction in the domain of webcam videos of students in a classroom environment. We developed a novel facial affect representation and a user-personalized training scheme that unlocks the potential of this representation. We designed several variants of a recurrent neural network that models the temporal structure of video sequences of students solving math problems. Our final model, named ATL-BP for Affect Transfer Learning for Behavior Prediction, achieves a relative increase in mean F -score of 50 % over the state-of-the-art method on this new dataset. 
    more » « less
  3. Feng, Mingyu ; Käser, Tanja ; Talukdar, Partha (Ed.)
    Recent research seeks to develop more comprehensive learner models for adaptive learning software. For example, models of reading comprehension built using data from students’ use of adaptive instructional software for mathematics have recently been developed. These models aim to deliver experiences that consider factors related to learning beyond performance in the target domain for instruction. We investigate the extent to which generalization is possible for a recently developed predictive model that seeks to infer students’ reading comprehension ability (as measured by end-of-year standardized test scores) using an introductory learning experience in Carnegie Learning’s MATHia intelligent tutoring system for mathematics. Building on a model learned on data from middle school students in a single school district in a mid-western U.S. state, using that state’s end-of-year English Language Arts (ELA) standardized test score as an outcome, we consider data from a school district in a south-eastern U.S. state as well as that state’s end-of-year ELA standardized test outcome. Generalization is explored by considering prediction performance when training and testing models on data from each of the individual school districts (and for their respective state’s test outcomes) as well as pooling data from both districts together. We conclude with discussion of investigations of some algorithmic fairness characteristics of the learned models. The results suggest that a model trained on data from the smaller of the two school districts considered may achieve greater fairness in its predictions over models trained on data from the other district or both districts, despite broad, overall similarities in some demographic characteristics of the two school districts. This raises interesting questions for future research on generalizing these kinds of models as well as on ensuring algorithmic fairness of resulting models for use in real-world adaptive systems for learning. 
    more » « less
  4. null (Ed.)
    Mobile devices are becoming a more common part of the education experience. Students can access their devices at any time to perform assignments or review material. Mobile apps can have the added advantage of being able to automatically grade student work and provide instantaneous feedback. However, numerous challenges remain in implementing effective mobile educational apps. One challenge is the small screen size of smartphones, which was a concern for a spatial visualization training app where students sketch isometric and orthographic drawings. This app was originally developed for iPads, but the wide prevalence of smartphones led to porting the software to iPhone and Android phones. The sketching assignments on a smartphone screen required more frequent zooming and panning, and one of the hypotheses of this study was that the educational effectiveness on smartphones was the same as on the larger screen sizes using iPad tablets. The spatial visualization mobile sketching app was implemented in a college freshman engineering graphics course to teach students how to sketch orthographic and isometric assignments. The app provides automatic grading and hint feedback to help students when they are stuck. Students in this pilot were assigned sketching problems as homework using their personal devices. Students were administered a pre- and post- spatial visualization test (PSVT-R, a reliable, well-validated instrument) to assess learning gains. The trial analysis focuses on students who entered the course with limited spatial visualization experience as identified based on a score of ≤70% on the PSVT:R since students entering college with low PSVT:R scores are at higher risk of dropping out of STEM majors. Among these low-performing students, those who used the app showed significant progress: (71%) raised their test scores above 70% bringing them out of the at-risk range for dropping out of engineering. While the PSVT:R test has been well validated, there are benefits to developing alternative methods of assessing spatial visualization skills. We developed an assembly pre- and post- test based upon a timed Lego™ exercise. At the start of the quarter, students were timed to see how long it would take them to build small lego sets using only visual instructions. Students were timed again on a different lego set after completion of the spatial visualization app. One benefit of the test was that it illustrated to the engineering students a skill that could be perceived as more relevant to their careers, and thus possibly increased their motivation for spatial visualization training. In addition, it may be possible to adapt the assembly test to elementary school grade levels where the PSVT:R test would not be suitable. Preliminary results show that the average lego build times decreased significantly after using the mobile app, indicating an improvement in students’ spatial reasoning skills. A comparison will also be done between normalized completion times on the assembly test and the PSVT:R tests in order to see how the assembly test compares to the “gold standard”. In addition to the PSVT-R instrument, a survey was conducted to evaluate student usage and their impressions of the app. Students found the app engaging, easy to use, and something they would do whenever they had “a free moment”. 95% of the students recommended the app to a friend if they are struggling with spatial visualization skills. This paper will describe the implementation of the mobile spatial visualization sketching app in a large college classroom, and highlight the app’s impact in increasing self-efficacy in spatial visualization and sketching 
    more » « less
  5. Deep Reinforcement Learning (DRL) has been shown to be a very powerful technique in recent years on a wide range of applications. Much of the prior DRL work took the online learning approach. However, given the challenges of building accurate simulations for modeling student learning, we investigated applying DRL to induce a pedagogical policy through an offiine approach. In this work, we explored the effectiveness of offiine DRL for pedagogical policy induction in an Intelligent Tutoring System. Generally speaking, when applying offiine DRL, we face two major challenges: one is limited training data and the other is the credit assignment problem caused by delayed rewards. In this work, we used Gaussian Processes to solve the credit assignment problem by estimating the inferred immediate rewards from the final delayed rewards. We then applied the DQN and Double-DQN algorithms to induce adaptive pedagogical strategies tailored to individual students. Our empirical results show that without solving the credit assignment problem, the DQN policy, although better than Double-DQN, was no better than a random policy. However, when combining DQN with the inferred rewards, our best DQN policy can outperform the random yet reasonable policy, especially for students with high pre-test scores. 
    more » « less