Title: GazeGraph: graph-based few-shot cognitive context sensing from human visual behavior
In this work, we present GazeGraph, a system that leverages human gaze as the sensing modality for cognitive context sensing. GazeGraph is a generalized framework that is compatible with different eye trackers and supports various gaze-based sensing applications. It ensures high sensing performance in the presence of heterogeneity in human visual behavior, and enables quick system adaptation to unseen sensing scenarios with few-shot instances. To achieve these capabilities, we introduce spatial-temporal gaze graphs and a deep learning-based representation learning method to extract powerful and generalized features from eye movements for context sensing. Furthermore, we develop a few-shot gaze graph learning module that adapts the "learning to learn" concept from meta-learning to enable quick system adaptation in a data-efficient manner. Our evaluation demonstrates that GazeGraph outperforms existing solutions in recognition accuracy by 45% on average over three datasets. Moreover, in few-shot learning scenarios, GazeGraph outperforms the transfer learning-based approach by 19% to 30%, while reducing the system adaptation time by 80%.
Award ID(s):
1908051
PAR ID:
10296635
Journal Name:
Proceedings of the 18th Conference on Embedded Networked Sensor Systems
Page Range / eLocation ID:
422 to 435
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent advances in eye tracking have given birth to a new genre of gaze-based context sensing applications, ranging from cognitive load estimation to emotion recognition. To achieve state-of-the-art recognition accuracy, a large-scale, labeled eye movement dataset is needed to train deep learning-based classifiers. However, due to the heterogeneity in human visual behavior, as well as the labor-intensive and privacy-compromising data collection process, datasets for gaze-based activity recognition are scarce and hard to collect. To alleviate the sparse gaze data problem, we present EyeSyn, a novel suite of psychology-inspired generative models that leverages only publicly available images and videos to synthesize a realistic and arbitrarily large eye movement dataset. Taking gaze-based museum activity recognition as a case study, our evaluation demonstrates that EyeSyn can not only replicate the distinct patterns in the actual gaze signals that are captured by an eye tracking device, but also simulate the signal diversity that results from different measurement setups and subject heterogeneity. Moreover, in the few-shot learning scenario, EyeSyn can be readily incorporated with either transfer learning or meta-learning to achieve 90% accuracy, without the need for a large-scale dataset for training.
  2. Human gaze estimation is a widely used technique for observing human behavior. The rapid adoption of deep learning techniques has extended gaze estimation to many application domains. The retail industry is one such domain, with challenging unconstrained environmental conditions such as eye occlusion and personal calibration. This study presents a novel gaze estimation model for single-user 2D gaze estimation in a retail environment. Our architecture, inspired by previous work in gaze following, models the scene and head features and further utilizes a shifted grids technique to accurately predict a saliency map. Our results show that the model can effectively infer 2D gaze in a retail environment. We achieve state-of-the-art performance on the Gaze On Objects (GOO) dataset, with a 25.2° angular error for gaze estimation. Furthermore, we provide a detailed analysis of the GOO dataset and comprehensively analyze the selected model feature extractor to support our results.
  3. In eye-tracked augmented and virtual reality (AR/VR), instantaneous and accurate hands-free selection of virtual elements is still a significant challenge. Though other methods that involve gaze-coupled head movements or hovering can improve selection times in comparison to methods like gaze-dwell, they are either not instantaneous or have difficulty ensuring that the user’s selection is deliberate. In this paper, we present EyeShadows, an eye gaze-based selection system that takes advantage of peripheral copies (shadows) of items that allow for quick selection and manipulation of an object or corresponding menus. This method is compatible with a variety of different selection tasks and controllable items, avoids the Midas touch problem, does not clutter the virtual environment, and is context sensitive. We have implemented and refined this selection tool for VR and AR, including testing with optical and video see-through (OST/VST) displays. Moreover, we demonstrate that this method can be used for a wide range of AR and VR applications, including manipulation of sliders or analog elements. We test its performance in VR against three other selection techniques, including dwell (baseline), an inertial reticle, and head-coupled selection. Results showed that selection with EyeShadows was significantly faster than dwell (baseline), outperforming it in the select and search-and-select tasks by 29.8% and 15.7%, respectively, though error rates varied between tasks.
  4. Emerging Virtual Reality (VR) displays with embedded eye trackers are becoming commodity hardware (e.g., HTC Vive Pro Eye). Eye-tracking data can be utilized for several purposes, including gaze monitoring, privacy protection, and user authentication/identification. Identifying users is an integral part of many applications due to security and privacy concerns. In this paper, we explore methods and eye-tracking features that can be used to identify users. Prior VR researchers explored machine learning on motion-based data (such as body motion, head tracking, eye tracking, and hand tracking data) to identify users. Such systems usually require an explicit VR task and many features to train the machine learning model for user identification. We propose a system to identify users utilizing minimal eye-gaze-based features without designing any identification-specific tasks. We collected gaze data from an educational VR application and tested our system with two machine learning (ML) models, random forest (RF) and k-nearest-neighbors (kNN), and two deep learning (DL) models, convolutional neural networks (CNN) and long short-term memory (LSTM). Our results show that ML and DL models could identify users with over 98% accuracy with only six simple eye-gaze features. We discuss our results, their implications for security and privacy, and the limitations of our work.
  5. Eye tracking has already made its way to current commercial wearable display devices, and is becoming increasingly important for virtual and augmented reality applications. However, the existing model-based eye tracking solutions are not capable of conducting very accurate gaze angle measurements, and may not be sufficient to solve challenging display problems such as pupil steering or eyebox expansion. In this paper, we argue that accurate detection and localization of the pupil in 3D space is a necessary intermediate step in model-based eye tracking. Existing methods and datasets either ignore evaluating the accuracy of 3D pupil localization or evaluate it only on synthetic data. To this end, we capture the first 3D pupil-gaze measurement dataset using a high precision setup with head stabilization and release it as the first benchmark dataset to evaluate both 3D pupil localization and gaze tracking methods. Furthermore, we utilize an advanced eye model to replace the commonly used oversimplified eye model. Leveraging the eye model, we propose a novel 3D pupil localization method with a deep learning-based corneal refraction correction. We demonstrate that our method outperforms state-of-the-art methods by reducing the 3D pupil localization error by 47.5% and the gaze estimation error by 18.7%. Our dataset and code can be found here: link.