Title: Deep Learning vs. Bayesian Knowledge Tracing: Student Models for Interventions.
Bayesian Knowledge Tracing (BKT) is a commonly used approach for student modeling, and Long Short-Term Memory (LSTM) is a versatile model that can be applied to a wide range of tasks, such as language translation. In this work, we directly compared three models, BKT, its variant Intervention-BKT (IBKT), and LSTM, on two types of student modeling tasks: post-test score prediction and learning gain prediction. Additionally, while previous work on student learning has often used skill/knowledge components identified by domain experts, we incorporated an automatic skill discovery method (SK), which includes a nonparametric prior over the exercise-skill assignments, into all three models. Thus, we explored a total of six models: BKT, BKT+SK, IBKT, IBKT+SK, LSTM, and LSTM+SK. Two training datasets were employed: one was collected from a natural language physics intelligent tutoring system named Cordillera, and the other from a standard probability intelligent tutoring system named Pyrenees. Overall, our results showed that BKT and BKT+SK outperformed the others on predicting post-test scores, whereas LSTM and LSTM+SK achieved the highest accuracy, F1-measure, and area under the ROC curve (AUC) on predicting learning gains. Furthermore, we demonstrated that by combining SK with the BKT model, BKT+SK could reliably predict post-test scores using only the earliest 50% of the entire training sequences. For early prediction of learning gains, LSTM using only the earliest 70% of the sequences delivered predictions comparable to those obtained with the entire training sequences. The findings support a learning environment that can predict students’ performance and learning gains early and adapt its pedagogical strategy accordingly.
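For readers unfamiliar with BKT, the following is a minimal sketch of the textbook single-skill BKT update, together with a truncation step that mimics "early prediction" from the first part of a response sequence. It is not the Intervention-BKT variant or the SK skill discovery method described in the abstract, and the names `BKTParams`, `predict_correct`, `update`, and `trace`, as well as the parameter values in the usage lines, are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BKTParams:
    """Hypothetical container for the four standard BKT parameters of one skill.

    All values are assumed to lie strictly between 0 and 1.
    """
    p_init: float   # P(L0): skill already known before any practice
    p_learn: float  # P(T): skill is learned at a practice opportunity
    p_guess: float  # P(G): correct answer although the skill is unknown
    p_slip: float   # P(S): incorrect answer although the skill is known


def predict_correct(p_known: float, params: BKTParams) -> float:
    """Probability that the next response is correct, given P(known)."""
    return p_known * (1.0 - params.p_slip) + (1.0 - p_known) * params.p_guess


def update(p_known: float, correct: bool, params: BKTParams) -> float:
    """One BKT step: Bayesian posterior on the response, then learning transition."""
    p_correct = predict_correct(p_known, params)
    if correct:
        posterior = p_known * (1.0 - params.p_slip) / p_correct
    else:
        posterior = p_known * params.p_slip / (1.0 - p_correct)
    # The student may learn the skill after this opportunity.
    return posterior + (1.0 - posterior) * params.p_learn


def trace(responses: Sequence[bool], params: BKTParams, fraction: float = 1.0) -> float:
    """Return P(known) after the earliest `fraction` of the response sequence."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    n_used = int(len(responses) * fraction)  # illustrative truncation rule
    p_known = params.p_init
    for correct in responses[:n_used]:
        p_known = update(p_known, correct, params)
    return p_known


if __name__ == "__main__":
    # Made-up parameters and responses, for illustration only.
    params = BKTParams(p_init=0.2, p_learn=0.1, p_guess=0.2, p_slip=0.1)
    responses = [True, False, True, True]
    print("full sequence:", trace(responses, params))
    print("earliest 50%: ", trace(responses, params, fraction=0.5))
```

The knowledge estimate from a truncated sequence could then feed a downstream predictor of post-test score; how the compared models actually map sequences to predictions is not specified in the abstract.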
Journal Name: Journal of Educational Data Mining
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract: Modeling student learning processes is highly complex since it is influenced by many factors such as motivation and learning habits. The high volume of features and tools provided by computer-based learning environments confounds the task of tracking student knowledge even further. Deep Learning models such as Long Short-Term Memory (LSTM) and classic Markovian models such as Bayesian Knowledge Tracing (BKT) have been successfully applied for student modeling. However, much of this prior work is designed to handle sequences of events with discrete timesteps, rather than considering the continuous aspect of time. Given that time elapsed between successive elements in a student’s trajectory can vary from seconds to days, we applied a Time-aware LSTM (T-LSTM) to model the dynamics of student knowledge state in continuous time. We investigate the effectiveness of T-LSTM on two domains with very different characteristics. One involves an open-ended programming environment where students can self-pace their progress and T-LSTM is compared against LSTM, Recent Temporal Pattern Mining, and the classic Logistic Regression (LR) on the early prediction of student success; the other involves a classic tutor-driven intelligent tutoring system where the tutor scaffolds the student learning step by step and T-LSTM is compared with LSTM, LR, and BKT on the early prediction of student learning gains. Our results show that T-LSTM significantly outperforms the other methods on the self-paced, open-ended programming environment, while on the tutor-driven ITS, it ties with LSTM and outperforms both LR and BKT. In other words, while time-irregularity exists in both datasets, T-LSTM works significantly better than other student models when the pace is driven by students. On the other hand, when such irregularity results from the tutor, T-LSTM was not superior to other models but its performance was not hurt either.
  2. Mobile devices are becoming a more common part of the education experience. Students can access their devices at any time to perform assignments or review material. Mobile apps can have the added advantage of being able to automatically grade student work and provide instantaneous feedback. However, numerous challenges remain in implementing effective mobile educational apps. One challenge is the small screen size of smartphones, which was a concern for a spatial visualization training app where students sketch isometric and orthographic drawings. This app was originally developed for iPads, but the wide prevalence of smartphones led to porting the software to iPhone and Android phones. The sketching assignments on a smartphone screen required more frequent zooming and panning, and one of the hypotheses of this study was that the educational effectiveness on smartphones was the same as on the larger screen sizes using iPad tablets. The spatial visualization mobile sketching app was implemented in a college freshman engineering graphics course to teach students how to sketch orthographic and isometric assignments. The app provides automatic grading and hint feedback to help students when they are stuck. Students in this pilot were assigned sketching problems as homework using their personal devices. Students were administered a pre- and post- spatial visualization test (PSVT:R, a reliable, well-validated instrument) to assess learning gains. The trial analysis focuses on students who entered the course with limited spatial visualization experience, identified by a score of ≤70% on the PSVT:R, since students entering college with low PSVT:R scores are at higher risk of dropping out of STEM majors. Among these low-performing students, those who used the app showed significant progress: 71% raised their test scores above 70%, bringing them out of the at-risk range for dropping out of engineering.
While the PSVT:R test has been well validated, there are benefits to developing alternative methods of assessing spatial visualization skills. We developed an assembly pre- and post- test based upon a timed Lego™ exercise. At the start of the quarter, students were timed to see how long it would take them to build small Lego sets using only visual instructions. Students were timed again on a different Lego set after completion of the spatial visualization app. One benefit of the test was that it illustrated to the engineering students a skill that could be perceived as more relevant to their careers, and thus possibly increased their motivation for spatial visualization training. In addition, it may be possible to adapt the assembly test to elementary school grade levels where the PSVT:R test would not be suitable. Preliminary results show that the average Lego build times decreased significantly after using the mobile app, indicating an improvement in students’ spatial reasoning skills. A comparison will also be done between normalized completion times on the assembly test and the PSVT:R tests in order to see how the assembly test compares to the “gold standard”. In addition to the PSVT:R instrument, a survey was conducted to evaluate student usage and their impressions of the app. Students found the app engaging, easy to use, and something they would do whenever they had “a free moment”. 95% of the students recommended the app for friends who are struggling with spatial visualization skills. This paper will describe the implementation of the mobile spatial visualization sketching app in a large college classroom, and highlight the app’s impact in increasing self-efficacy in spatial visualization and sketching.
  3. Deep Reinforcement Learning (DRL) has been shown to be a very powerful technique in recent years on a wide range of applications. Much of the prior DRL work took the online learning approach. However, given the challenges of building accurate simulations for modeling student learning, we investigated applying DRL to induce a pedagogical policy through an offline approach. In this work, we explored the effectiveness of offline DRL for pedagogical policy induction in an Intelligent Tutoring System. Generally speaking, when applying offline DRL, we face two major challenges: one is limited training data and the other is the credit assignment problem caused by delayed rewards. In this work, we used Gaussian Processes to solve the credit assignment problem by estimating the inferred immediate rewards from the final delayed rewards. We then applied the DQN and Double-DQN algorithms to induce adaptive pedagogical strategies tailored to individual students. Our empirical results show that without solving the credit assignment problem, the DQN policy, although better than Double-DQN, was no better than a random policy. However, when combining DQN with the inferred rewards, our best DQN policy can outperform the random yet reasonable policy, especially for students with high pre-test scores.
  4. There is increasing interest in how the pupil dynamics of the eye reflect underlying cognitive processes and brain states. Problematic, however, is that pupil changes can be due to non-cognitive factors, for example luminance changes in the environment, accommodation and movement. In this paper we consider how by modeling the response of the pupil in real-world environments we can capture the non-cognitive related changes and remove these to extract a residual signal which is a better index of cognition and performance. Specifically, we utilize sequence measures such as fixation position, duration, saccades, and blink-related information as inputs to a deep recurrent neural network (RNN) model for predicting subsequent pupil diameter. We build and evaluate the model for a task where subjects are watching educational videos and subsequently asked questions based on the content. Compared to commonly-used models for this task, the RNN had the lowest error rates in predicting subsequent pupil dilation given sequence data. Most important was how the model output related to subjects' cognitive performance as assessed by a post-viewing test. Consistent with our hypothesis that the model captures non-cognitive pupil dynamics, we found (1) the model's root-mean-square error was less for lower performing subjects than for those having better performance on the post-viewing test, (2) the residuals of the RNN (LSTM) model had the highest correlation with subject post-viewing test scores and (3) the residuals had the highest discriminability (assessed via area under the ROC curve, AUC) for classifying high and low test performers, compared to the true pupil size or the RNN model predictions. This suggests that deep learning sequence models may be good for separating components of pupil responses that are linked to luminance and accommodation from those that are linked to cognition and arousal.
  5. Many coastal cities are facing frequent flooding from storm events that are made worse by sea level rise and climate change. The groundwater table level in these low relief coastal cities is an important, but often overlooked, factor in the recurrent flooding these locations face. Infiltration of stormwater and water intrusion due to tidal forcing can cause already shallow groundwater tables to quickly rise toward the land surface. This decreases available storage, which increases runoff, stormwater system loads, and flooding. Groundwater table forecasts, which could help inform the modeling and management of coastal flooding, are generally unavailable. This study explores two machine learning models, Long Short-Term Memory (LSTM) networks and Recurrent Neural Networks (RNN), to model and forecast groundwater table response to storm events in the flood prone coastal city of Norfolk, Virginia. To determine the effect of training data type on model accuracy, two types of datasets, (i) the continuous time series and (ii) a dataset of only storm events, created from observed groundwater table, rainfall, and sea level data from 2010–2018, are used to train and test the models. Additionally, a real-time groundwater table forecasting scenario was carried out to compare the models’ abilities to predict groundwater table levels given forecast rainfall and sea level as input data. When modeling the groundwater table with observed data, LSTM networks were found to have more predictive skill than RNNs (root mean squared error (RMSE) of 0.09 m versus 0.14 m, respectively). The real-time forecast scenario showed that models trained only on storm event data outperformed models trained on the continuous time series data (RMSE of 0.07 m versus 0.66 m, respectively) and that LSTM outperformed RNN models.
Because models trained with the continuous time series data had much higher RMSE values, they were not suitable for predicting the groundwater table in the real-time scenario when using forecast input data. These results demonstrate the first use of LSTM networks to create hourly forecasts of groundwater table in a coastal city and show they are well suited for creating operational forecasts in real-time. As groundwater table levels increase due to sea level rise, forecasts of groundwater table will become an increasingly valuable part of coastal flood modeling and management.