Title: What Time is It? Student Modeling Needs to Know.
Abstract: Modeling student learning processes is highly complex since it is influenced by many factors such as motivation and learning habits. The high volume of features and tools provided by computer-based learning environments further complicates the task of tracking student knowledge. Deep learning models such as Long Short-Term Memory (LSTM) and classic Markovian models such as Bayesian Knowledge Tracing (BKT) have been successfully applied to student modeling. However, much of this prior work is designed to handle sequences of events with discrete timesteps, rather than considering the continuous aspect of time. Given that the time elapsed between successive elements in a student’s trajectory can vary from seconds to days, we applied a Time-aware LSTM (T-LSTM) to model the dynamics of student knowledge state in continuous time. We investigate the effectiveness of T-LSTM on two domains with very different characteristics. One involves an open-ended programming environment where students can self-pace their progress, and T-LSTM is compared against LSTM, Recent Temporal Pattern Mining, and the classic Logistic Regression (LR) on the early prediction of student success; the other involves a classic tutor-driven intelligent tutoring system (ITS) where the tutor scaffolds student learning step by step, and T-LSTM is compared with LSTM, LR, and BKT on the early prediction of student learning gains. Our results show that T-LSTM significantly outperforms the other methods on the self-paced, open-ended programming environment, while on the tutor-driven ITS it ties with LSTM and outperforms both LR and BKT. In other words, while time irregularity exists in both datasets, T-LSTM works significantly better than other student models when the pace is driven by students. On the other hand, when such irregularity results from the tutor, T-LSTM was not superior to other models, but its performance was not hurt either.
Award ID(s):
1651909
NSF-PAR ID:
10214148
Journal Name:
In Proceedings of the 13th International Conference on Educational Data Mining (EDM) 2020
Page Range / eLocation ID:
pp 171-182
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
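The abstract above turns on one idea: the elapsed time between successive events should discount what the model remembers. The sketch below illustrates one commonly described way a time-aware LSTM cell can do this, by splitting the previous cell memory into a short-term and a long-term part and shrinking only the short-term part as the gap grows. The decay function `1 / log(e + Δt)`, the names `time_decay` and `adjust_memory`, and the weights `W_d` and `b_d` are assumptions made for illustration; the record above does not state the exact formulation the authors used.

```python
import numpy as np

def time_decay(delta_t):
    """Assumed decay weight g(dt) = 1 / log(e + dt); equals 1 when dt = 0."""
    return 1.0 / np.log(np.e + np.asarray(delta_t, dtype=float))

def adjust_memory(c_prev, delta_t, W_d, b_d):
    """Discount the short-term part of the previous cell memory by elapsed time.

    c_prev  : previous cell state, shape (hidden,)
    delta_t : time elapsed since the previous event (any consistent unit)
    W_d, b_d: hypothetical learned parameters of the decomposition layer
    """
    c_short = np.tanh(W_d @ c_prev + b_d)   # short-term component
    c_long = c_prev - c_short               # long-term component, left untouched
    return c_long + time_decay(delta_t) * c_short

# Illustrative values only: a 3-unit memory and two gaps (one minute, one day).
rng = np.random.default_rng(0)
c_prev = rng.normal(size=3)
W_d = rng.normal(size=(3, 3))
b_d = np.zeros(3)
after_minute = adjust_memory(c_prev, 60.0, W_d, b_d)
after_day = adjust_memory(c_prev, 86400.0, W_d, b_d)
```

With a zero gap the adjusted memory equals the previous memory, so this adjustment reduces to a standard LSTM step when events are evenly and closely spaced; the adjusted memory would then feed the usual LSTM gates in place of the previous cell state.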
More Like this
  1. Early prediction of student difficulty during long-duration learning activities allows a tutoring system to intervene by providing needed support, such as a hint, or by alerting an instructor. To be effective, these predictions must come early and be highly accurate, but such predictions are difficult for open-ended programming problems. In this work, Recent Temporal Patterns (RTPs) are used in conjunction with Support Vector Machine and Logistic Regression to build robust yet interpretable models for early predictions. We addressed two tasks, predicting student success and predicting student difficulty, during a single open-ended novice programming task: drawing a square-shaped spiral. To predict whether students would be able to complete the programming task, we compared RTP against several machine learning models, ranging from classic models to more recent deep learning models such as Long Short-Term Memory. Our results show that RTP-based models outperformed all others and could successfully classify students after just one minute of a 20-minute exercise (on which students can spend more than one hour). To determine when a system might intervene to prevent incompleteness or eventual dropout, we applied RTP at regular intervals to predict whether a student would make progress within the next five minutes; a lack of progress suggests that the student may be having difficulty. RTP successfully classified these students needing interventions over 85% of the time, with increased accuracy when using data-driven program features. These results contribute significantly to the potential to build a fully data-driven tutoring system for novice programming.
  2. Bayesian Knowledge Tracing (BKT) is a commonly used approach for student modeling, and Long Short-Term Memory (LSTM) is a versatile model that can be applied to a wide range of tasks, such as language translation. In this work, we directly compared three models, BKT, its variant Intervention-BKT (IBKT), and LSTM, on two types of student modeling tasks: post-test score prediction and learning gain prediction. Additionally, while previous work on student learning has often used skill/knowledge components identified by domain experts, we incorporated an automatic skill discovery method (SK), which includes a nonparametric prior over the exercise-skill assignments, into all three models. Thus, we explored a total of six models: BKT, BKT+SK, IBKT, IBKT+SK, LSTM, and LSTM+SK. Two training datasets were employed: one was collected from a natural language physics intelligent tutoring system named Cordillera, and the other from a standard probability intelligent tutoring system named Pyrenees. Overall, our results showed that BKT and BKT+SK outperformed the others on predicting post-test scores, whereas LSTM and LSTM+SK achieved the highest accuracy, F1-measure, and area under the ROC curve (AUC) on predicting learning gains. Furthermore, we demonstrated that by combining SK with the BKT model, BKT+SK could reliably predict post-test scores using only the earliest 50% of the entire training sequences. For early prediction of learning gains, LSTM using the earliest 70% of the sequences delivered predictions comparable to those obtained with the entire training sequences. These findings point toward a learning environment that can predict students’ performance and learning gains early and adapt its pedagogical strategy accordingly.
  3. Determining when and whether to provide personalized support is a well-known challenge called the assistance dilemma. A core problem in solving the assistance dilemma is the need to discover when students are unproductive so that the tutor can intervene. Such a task is particularly challenging for open-ended domains, even those that are well-structured with defined principles and goals. We present a set of data-driven methods to classify, predict, and prevent unproductive problem-solving steps in the well-structured open-ended domain of logic. This approach leverages and extends the Hint Factory, a set of methods that uses prior student solution attempts to build data-driven intelligent tutors. We present a HelpNeed classification that uses prior student data to determine when students are likely to be unproductive and need help learning optimal problem-solving strategies. We present a controlled study to determine the impact of an Adaptive pedagogical policy that provides proactive hints at the start of each step based on the outcomes of our HelpNeed predictor: productive vs. unproductive. Our results show that, compared to the Control condition, students in the Adaptive condition exhibited better training behaviors, with lower help avoidance and higher help appropriateness (a higher chance of receiving help when it was likely to be needed), as measured using the HelpNeed classifier. Furthermore, the results show that the students who received Adaptive hints based on HelpNeed predictions during training significantly outperformed their Control peers on the posttest, producing shorter, more optimal solutions in less time. We conclude with suggestions on how these HelpNeed methods could be applied in other well-structured open-ended domains.
  4. Advances in visual perceptual tasks have been mainly driven by the amount, and types, of annotations of large-scale datasets. Researchers have focused on fully-supervised settings to train models using offline epoch-based schemes. Despite the evident advancements, limitations and cost of manually annotated datasets have hindered further development for event perceptual tasks, such as detection and localization of objects and events in videos. The problem is more apparent in zoological applications due to the scarcity of annotations and the length of videos (most videos are at most ten minutes long). Inspired by cognitive theories, we present a self-supervised perceptual prediction framework to tackle the problem of temporal event segmentation by building a stable representation of event-related objects. The approach is simple but effective. We rely on LSTM predictions of high-level features computed by a standard deep learning backbone. For spatial segmentation, the stable representation of the object is used by an attention mechanism to filter the input features before the prediction step. The self-learned attention maps effectively localize the object as a side effect of perceptual prediction. We demonstrate our approach on long videos from continuous wildlife video monitoring, spanning multiple days at 25 FPS. We aim to facilitate automated ethogramming by detecting and localizing events without the need for labels. Our approach is trained in an online manner on streaming input and requires only a single pass through the video, with no separate training set. Given the lack of long and realistic datasets (i.e., datasets that include real-world challenges), we introduce a new wildlife video dataset, nest monitoring of the Kagu (a flightless bird from New Caledonia), to benchmark our approach. Our dataset features a video from 10 days (over 23 million frames) of continuous monitoring of the Kagu in its natural habitat. We annotate every frame with bounding boxes and event labels. Additionally, each frame is annotated with time-of-day and illumination conditions. We will make the dataset, which is the first of its kind, and the code available to the research community. We find that the approach significantly outperforms other self-supervised baselines, both traditional (e.g., Optical Flow, Background Subtraction) and NN-based (e.g., PA-DPC, DINO, iBOT), and performs on par with supervised boundary detection approaches (i.e., PC). At a recall rate of 80%, our best performing model detects one false positive activity every 50 min of training. On average, we at least double the performance of self-supervised approaches for spatial segmentation. Additionally, we show that our approach is robust to various environmental conditions (e.g., moving shadows). We also benchmark the framework on other datasets (i.e., Kinetics-GEBD, TAPOS) from different domains to demonstrate its generalizability. The data and code are available on our project page: https://aix.eng.usf.edu/research_automated_ethogramming.html
  5. We conducted a study to see whether using Bayesian Knowledge Tracing (BKT) models would reduce the time spent and the number of problems solved in programming tutors. We used legacy data collected by two programming tutors to compute BKT models for every concept covered by each tutor. The novelty of our model was that slip and guess parameters were computed for every problem presented by each tutor. Next, we used cross-validation to evaluate whether the resulting BKT model would have reduced the number of practice problems solved and the time spent by the students represented in the legacy data. We found that for 64.23% of the concepts, students would have saved time with the BKT model. The savings varied among concepts. Overall, students would have saved a mean of 1.28 minutes and 1.23 problems per concept. We also found that BKT models were more effective at saving time and problems on harder concepts.
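Several of the abstracts above rely on Bayesian Knowledge Tracing, and the last one fits slip and guess parameters per problem. As a rough illustration of what those parameters do, the textbook BKT update can be sketched as follows. The function names, the `learn` (transition) parameter name, and all numeric values are illustrative assumptions and are not taken from any of the studies listed here.

```python
def bkt_update(p_known, correct, slip, guess, learn):
    """One textbook BKT step: condition on the response, then apply learning.

    p_known: prior probability that the student knows the concept
    correct: whether the observed response was correct
    slip   : P(incorrect | known); guess: P(correct | not known)
    learn  : P(moving from not known to known after the practice opportunity)
    All probabilities are assumed to lie strictly between 0 and 1.
    """
    if correct:
        num = p_known * (1 - slip)
        den = num + (1 - p_known) * guess
    else:
        num = p_known * slip
        den = num + (1 - p_known) * (1 - guess)
    posterior = num / den
    return posterior + (1 - posterior) * learn

def trace(attempts, p_init, learn):
    """Track mastery over attempts given as (correct, slip, guess) tuples,
    so slip and guess may differ from one problem to the next."""
    p = p_init
    history = []
    for correct, slip, guess in attempts:
        p = bkt_update(p, correct, slip, guess, learn)
        history.append(p)
    return history

# Hypothetical attempts on one concept, with per-problem slip and guess values.
attempts = [(True, 0.10, 0.20), (False, 0.15, 0.25), (True, 0.05, 0.30)]
mastery = trace(attempts, p_init=0.3, learn=0.1)
```

A tutor that stops assigning problems once the tracked probability passes a chosen mastery threshold is one way such a model could save students practice problems; the threshold itself would be a further design choice.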