Educational VR may increase engagement and retention compared to traditional learning, for some topics or students. However, a student could still get distracted and disengaged due to stress, mind-wandering, unwanted noise, external alerts, etc. Student eye gaze can be useful for detecting distraction. For example, we previously considered gaze visualizations to help teachers understand student attention to better identify or guide distracted students. However, it is not practical for a teacher to monitor a large number of student indicators while teaching. To help filter students based on distraction level, we consider a deep learning approach to detect distraction from gaze data. The key aspects are: (1) we created a labeled eye gaze dataset (3.4M data points) from an educational VR environment, (2) we propose an automatic system to gauge a student's distraction level from gaze data, and (3) we apply and compare three deep neural classifiers for this purpose. A proposed CNN-LSTM classifier achieved an accuracy of 89.8% for classifying distraction, per educational activity section, into one of three levels.
Supervised vs Unsupervised Learning on Gaze Data to Classify Student Distraction Level in an Educational VR Environment
Educational VR may help students by being more engaging or improving retention compared to traditional learning methods. However, a student can get distracted in a VR environment due to stress, mind-wandering, unwanted noise, external alerts, etc. Student eye gaze can be useful for detecting these distractions. We explore deep-learning-based approaches to detect distractions from gaze data. We designed an educational VR environment and trained three deep learning models (CNN, LSTM, and CNN-LSTM) to gauge a student’s distraction level from gaze data, using both supervised and unsupervised learning methods. Our results show that supervised learning provided better test accuracy than unsupervised learning.
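As a rough illustration of the data handling that sequence classifiers of this kind typically need, the sketch below cuts a gaze stream into fixed-length windows and computes test accuracy over integer distraction labels. It is not the authors' pipeline: the sample layout `(gaze_x, gaze_y, pupil_diameter)`, the window size and stride, and the function names `make_windows` and `accuracy` are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Hypothetical sample layout (an assumption, not taken from the paper):
# (gaze_x, gaze_y, pupil_diameter)
GazeSample = Tuple[float, float, float]


def make_windows(
    samples: Sequence[GazeSample], size: int, stride: int
) -> List[List[GazeSample]]:
    """Cut a gaze stream into fixed-length, possibly overlapping windows.

    Each window could serve as one input sequence for a CNN, LSTM,
    or CNN-LSTM classifier. Trailing samples that do not fill a whole
    window are dropped.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    last_start = len(samples) - size
    return [list(samples[i:i + size]) for i in range(0, last_start + 1, stride)]


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Fraction of windows whose predicted distraction level matches the label."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


if __name__ == "__main__":
    # Synthetic stream of 10 samples, used only to exercise the functions.
    stream = [(0.1 * i, 0.2 * i, 3.0) for i in range(10)]
    windows = make_windows(stream, size=4, stride=2)
    print(len(windows))  # number of windows produced
    # Labels are assumed integer codes for distraction levels (e.g. 0, 1, 2).
    print(accuracy([0, 1, 2, 2], [0, 1, 1, 2]))
```

Overlapping windows (a stride smaller than the window size) give more training sequences from the same recording, at the cost of correlated examples; whether that trade-off suits a given dataset is a design choice this sketch leaves open.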
- Award ID(s):
- 1815976
- PAR ID:
- 10338215
- Date Published:
- Journal Name:
- 2021 ACM Symposium on Spatial User Interaction
- Page Range / eLocation ID:
- 1 to 2
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Emerging Virtual Reality (VR) displays with embedded eye trackers are becoming commodity hardware (e.g., HTC Vive Pro Eye). Eye-tracking data can be utilized for several purposes, including gaze monitoring, privacy protection, and user authentication/identification. Identifying users is an integral part of many applications due to security and privacy concerns. In this paper, we explore methods and eye-tracking features that can be used to identify users. Prior VR researchers explored machine learning on motion-based data (such as body motion, head tracking, eye tracking, and hand tracking data) to identify users. Such systems usually require an explicit VR task and many features to train the machine learning model for user identification. We propose a system to identify users utilizing minimal eye-gaze-based features without designing any identification-specific tasks. We collected gaze data from an educational VR application and tested our system with two machine learning (ML) models, random forest (RF) and k-nearest-neighbors (kNN), and two deep learning (DL) models: convolutional neural networks (CNN) and long short-term memory (LSTM). Our results show that ML and DL models could identify users with over 98% accuracy with only six simple eye-gaze features. We discuss our results, their implications for security and privacy, and the limitations of our work.
-
Egocentric perception has grown rapidly with the advent of immersive computing devices. Human gaze prediction is an important problem in analyzing egocentric videos and has primarily been tackled through either saliency-based modeling or highly supervised learning. We quantitatively analyze the generalization capabilities of supervised, deep learning models on the egocentric gaze prediction task on unseen, out-of-domain data. We find that their performance is highly dependent on the training data and is restricted to the domains specified in the training annotations. In this work, we tackle the problem of jointly predicting human gaze points and temporal segmentation of egocentric videos without using any training data. We introduce an unsupervised computational model that draws inspiration from cognitive psychology models of event perception. We use Grenander's pattern theory formalism to represent spatial-temporal features and model surprise as a mechanism to predict gaze fixation points. Extensive evaluation on two publicly available datasets, GTEA and GTEA+, shows that the proposed model can significantly outperform all unsupervised baselines and some supervised gaze prediction baselines. Finally, we show that the model can also temporally segment egocentric videos with a performance comparable to more complex, fully supervised deep learning baselines.
-
The Unified Parkinson’s Disease Rating Scale (UPDRS) is used to recognize patients with Parkinson’s disease (PD) and rate its severity. The rating is crucial for disease progression monitoring and treatment adjustment. This study aims to advance the capabilities of PD management by developing an innovative framework that integrates deep learning with wearable sensor technology to enhance the precision of UPDRS assessments. We introduce a series of deep learning models to estimate UPDRS Part III scores, utilizing motion data from wearable sensors. Our approach leverages a novel Multi-shared-task Self-supervised Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM) framework that processes raw gyroscope signals and their spectrogram representations. This technique aims to refine the estimation accuracy of PD severity during naturalistic human activities. Utilizing 526 min of data from 24 PD patients engaged in everyday activities, our methodology demonstrates a strong correlation of 0.89 between estimated and clinically assessed UPDRS-III scores. This model outperforms the benchmark set by single and multichannel CNN, LSTM, and CNN-LSTM models and establishes a new standard in UPDRS-III score estimation for free-body movements compared to recent state-of-the-art methods. These results signify a substantial step forward in bioengineering applications for PD monitoring, providing a robust framework for reliable and continuous assessment of PD symptoms in daily living settings.
-
Modeling student learning processes is highly complex since they are influenced by many factors such as motivation and learning habits. The high volume of features and tools provided by computer-based learning environments confounds the task of tracking student knowledge even further. Deep learning models such as Long Short-Term Memory (LSTM) networks and classic Markovian models such as Bayesian Knowledge Tracing (BKT) have been successfully applied for student modeling. However, much of this prior work is designed to handle sequences of events with discrete timesteps, rather than considering the continuous aspect of time. Given that the time elapsed between successive elements in a student’s trajectory can vary from seconds to days, we applied a Time-aware LSTM (T-LSTM) to model the dynamics of student knowledge state in continuous time. We investigate the effectiveness of T-LSTM on two domains with very different characteristics. One involves an open-ended programming environment where students can self-pace their progress and T-LSTM is compared against LSTM, Recent Temporal Pattern Mining, and the classic Logistic Regression (LR) on the early prediction of student success; the other involves a classic tutor-driven intelligent tutoring system (ITS) where the tutor scaffolds the student learning step by step and T-LSTM is compared with LSTM, LR, and BKT on the early prediction of student learning gains. Our results show that T-LSTM significantly outperforms the other methods on the self-paced, open-ended programming environment, while on the tutor-driven ITS, it ties with LSTM and outperforms both LR and BKT. In other words, while time-irregularity exists in both datasets, T-LSTM works significantly better than other student models when the pace is driven by students. On the other hand, when such irregularity results from the tutor, T-LSTM was not superior to other models, but its performance was not hurt either.

