This content will become publicly available on June 1, 2023
Deep Learning or Deep Ignorance? Comparing Untrained Recurrent Models in Educational Contexts
The development and application of deep learning methodologies have grown within educational contexts in recent years. Perhaps attributable, in part, to the large amount of data that is made available through the adoption of computer-based learning systems in classrooms and larger-scale MOOC platforms, many educational researchers are leveraging a wide range of emerging deep learning approaches to study learning and student behavior in various capacities. Variations of recurrent neural networks, for example, have been used to not only predict learning outcomes but also to study sequential and temporal trends in student data; it is commonly believed that they are able to learn high-dimensional representations of learning and behavioral constructs over time, such as the evolution of a student’s knowledge state while working through assigned content. Recent works, however, have started to dispute this belief, instead finding that it may be the model’s complexity that leads to improved performance in many prediction tasks and that these methods may not inherently learn these temporal representations through model training. In this work, we explore these claims further in the context of detectors of student affect as well as expanding on existing work that explored benchmarks in knowledge tracing. Specifically, we observe how well trained models perform compared to deep learning networks where training is applied only …
- Award ID(s): 1903304
- Publication Date:
- NSF-PAR ID: 10331809
- Journal Name: Proceedings of the 23rd International Conference on Artificial Intelligence in Education
- Page Range or eLocation-ID: in press
- Sponsoring Org: National Science Foundation
More Like this
-
The increased usage of computer-based learning platforms and online tools in classrooms presents new opportunities to not only study the underlying constructs involved in the learning process, but also use this information to identify and aid struggling students. Many learning platforms, particularly those driving or supplementing instruction, are only able to provide aid to students who interact with the system. With this in mind, student persistence emerges as a prominent learning construct contributing to students' success when learning new material. Conversely, high persistence is not always productive for students, as additional practice does not always help the student move toward a state of mastery of the material. In this paper, we apply a transfer learning methodology using deep learning and traditional modeling techniques to study high and low representations of unproductive persistence. We focus on two prominent problems in the fields of educational data mining and learner analytics representing low persistence, characterized as student "stopout," and unproductive high persistence, operationalized through student "wheel spinning," in an effort to better understand the relationship between these measures of unproductive persistence (i.e., stopout and wheel spinning) and develop early detectors of these behaviors. We find that models developed to detect each within and across-assignment …
-
Recent work on automated scoring of student responses in educational applications has shown gains in human-machine agreement from neural models, particularly recurrent neural networks (RNNs) and pre-trained transformer (PT) models. However, prior research has neglected to investigate the reasons for improvement – in particular, whether models achieve gains for the “right” reasons. Through expert analysis of saliency maps, we analyze the extent to which models attribute importance to words and phrases in student responses that align with question rubrics. We focus on responses to questions that are embedded in science units for middle school students accessed via an online classroom system. RNN and PT models were trained to predict an ordinal score from each response’s text, and experts analyzed generated saliency maps for each response. Our analysis shows that RNN and PT-based models can produce substantially different saliency profiles while often predicting the same scores for the same student responses. While there is some indication that PT models are better able to avoid spurious correlations of high-frequency words with scores, results indicate that both models focus on learning statistical correlations between scores and words and do not demonstrate an ability to learn key phrases or longer linguistic units corresponding to …
-
Spiking neural networks (SNNs) are well suited to spatio-temporal learning and energy-efficient event-driven neuromorphic hardware processors. As an important class of SNNs, recurrent spiking neural networks (RSNNs) possess great computational power. However, the practical application of RSNNs is severely limited by challenges in training. Biologically-inspired unsupervised learning has limited capability in boosting the performance of RSNNs. On the other hand, existing backpropagation (BP) methods suffer from high complexity of unfolding in time, vanishing and exploding gradients, and approximate differentiation of discontinuous spiking activities when applied to RSNNs. To enable supervised training of RSNNs under a well-defined loss function, we present a novel Spike-Train level RSNNs Backpropagation (ST-RSBP) algorithm for training deep RSNNs. The proposed ST-RSBP directly computes the gradient of a rate-coded loss function defined at the output layer of the network w.r.t. tunable parameters. The scalability of ST-RSBP is achieved by the proposed spike-train level computation, during which temporal effects of the SNN are captured in both the forward and backward pass of BP. Our ST-RSBP algorithm can be broadly applied to RSNNs with a single recurrent layer or deep RSNNs with multiple feedforward and recurrent layers. Based upon challenging speech and image datasets including TI46, N-TIDIGITS, Fashion-MNIST and MNIST, ST-RSBP …