skip to main content

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 11:00 PM ET on Friday, December 13 until 2:00 AM ET on Saturday, December 14 due to maintenance. We apologize for the inconvenience.


Title: The Eye in Extended Reality: A Survey on Gaze Interaction and Eye Tracking in Head-worn Extended Reality
With innovations in the field of gaze and eye tracking, a new concentration of research in the area of gaze-tracked systems and user interfaces has formed in the field of Extended Reality (XR). Eye trackers are being used to explore novel forms of spatial human–computer interaction, to understand human attention and behavior, and to test expectations and human responses. In this article, we review gaze interaction and eye tracking research related to XR that has been published since 1985, which includes a total of 215 publications. We outline efforts to apply eye gaze for direct interaction with virtual content and design of attentive interfaces that adapt the presented content based on eye gaze behavior and discuss how eye gaze has been utilized to improve collaboration in XR. We outline trends and novel directions and discuss representative high-impact papers in detail.  more » « less
Award ID(s):
1800961
PAR ID:
10343531
Author(s) / Creator(s):
; ; ; ; ;
Date Published:
Journal Name:
ACM Computing Surveys
Volume:
55
Issue:
3
ISSN:
0360-0300
Page Range / eLocation ID:
1 to 39
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Virtual reality (VR) simulations have been adopted to provide controllable environments for running augmented reality (AR) experiments in diverse scenarios. However, insufficient research has explored the impact of AR applications on users, especially their attention patterns, and whether VR simulations accurately replicate these effects. In this work, we propose to analyze user attention patterns via eye tracking during XR usage. To represent applications that provide both helpful guidance and irrelevant information, we built a Sudoku Helper app that includes visual hints and potential distractions during the puzzle-solving period. We conducted two user studies with 19 different users each in AR and VR, in which we collected eye tracking data, conducted gaze-based analysis, and trained machine learning (ML) models to predict user attentional states and attention control ability. Our results show that the AR app had a statistically significant impact on enhancing attention by increasing the fixated proportion of time, while the VR app reduced fixated time and made the users less focused. Results indicate that there is a discrepancy between VR simulations and the AR experience. Our ML models achieve 99.3% and 96.3% accuracy in predicting user attention control ability in AR and VR, respectively. A noticeable performance drop when transferring models trained on one medium to the other further highlights the gap between the AR experience and the VR simulation of it. 
    more » « less
  2. Emerging Virtual Reality (VR) displays with embedded eye trackers are currently becoming a commodity hardware (e.g., HTC Vive Pro Eye). Eye-tracking data can be utilized for several purposes, including gaze monitoring, privacy protection, and user authentication/identification. Identifying users is an integral part of many applications due to security and privacy concerns. In this paper, we explore methods and eye-tracking features that can be used to identify users. Prior VR researchers explored machine learning on motion-based data (such as body motion, head tracking, eye tracking, and hand tracking data) to identify users. Such systems usually require an explicit VR task and many features to train the machine learning model for user identification. We propose a system to identify users utilizing minimal eye-gaze-based features without designing any identification-specific tasks. We collected gaze data from an educational VR application and tested our system with two machine learning (ML) models, random forest (RF) and k-nearest-neighbors (kNN), and two deep learning (DL) models: convolutional neural networks (CNN) and long short-term memory (LSTM). Our results show that ML and DL models could identify users with over 98% accuracy with only six simple eye-gaze features. We discuss our results, their implications on security and privacy, and the limitations of our work. 
    more » « less
  3. Extended reality (XR) technologies, such as virtual reality (VR) and augmented reality (AR), provide users, their avatars, and embodied agents a shared platform to collaborate in a spatial context. Although traditional face-to-face communication is limited by users’ proximity, meaning that another human’s non-verbal embodied cues become more difficult to perceive the farther one is away from that person, researchers and practitioners have started to look into ways to accentuate or amplify such embodied cues and signals to counteract the effects of distance with XR technologies. In this article, we describe and evaluate the Big Head technique, in which a human’s head in VR/AR is scaled up relative to their distance from the observer as a mechanism for enhancing the visibility of non-verbal facial cues, such as facial expressions or eye gaze. To better understand and explore this technique, we present two complimentary human-subject experiments in this article. In our first experiment, we conducted a VR study with a head-mounted display to understand the impact of increased or decreased head scales on participants’ ability to perceive facial expressions as well as their sense of comfort and feeling of “uncannniness” over distances of up to 10 m. We explored two different scaling methods and compared perceptual thresholds and user preferences. Our second experiment was performed in an outdoor AR environment with an optical see-through head-mounted display. Participants were asked to estimate facial expressions and eye gaze, and identify a virtual human over large distances of 30, 60, and 90 m. In both experiments, our results show significant differences in minimum, maximum, and ideal head scales for different distances and tasks related to perceiving faces, facial expressions, and eye gaze, and we also found that participants were more comfortable with slightly bigger heads at larger distances. We discuss our findings with respect to the technologies used, and we discuss implications and guidelines for practical applications that aim to leverage XR-enhanced facial cues. 
    more » « less
  4. There is a lack of datasets for visual-inertial odometry applications in Extended Reality (XR). To the best of our knowledge, there is no dataset available that is captured from an XR headset with a human as a carrier. To bridge this gap, we present a novel pose estimation dataset --- called HoloSet --- collected using Microsoft Hololens 2, which is a state-of-the-art head mounted device for XR. Potential applications for HoloSet include visual-inertial odometry, simultaneous localization and mapping (SLAM), and additional applications in XR that leverage visual-inertial data. HoloSet captures both macro and micro movements. For macro movements, the dataset consists of more than 66,000 samples of visual, inertial, and depth camera data in a variety of environments (indoor, outdoor) and scene setups (trails, suburbs, downtown) under multiple user action scenarios (walk, jog). For micro movements, the dataset consists of more than 12,000 samples of additional articulated hand depth camera images while a user plays games that exercise fine motor skills and hand-eye coordination. We present basic visualizations and high-level statistics of the data and outline the potential research use cases for HoloSet. 
    more » « less
  5. Abstract

    Eye tracking provides direct, temporally and spatially sensitive measures of eye gaze. It can capture visual attention patterns from infancy through adulthood. However, commonly used screen-based eye tracking (SET) paradigms are limited in their depiction of how individuals process information as they interact with the environment in “real life”. Mobile eye tracking (MET) records participant-perspective gaze in the context of active behavior. Recent technological developments in MET hardware enable researchers to capture egocentric vision as early as infancy and across the lifespan. However, challenges remain in MET data collection, processing, and analysis. The present paper aims to provide an introduction and practical guide to starting researchers in the field to facilitate the use of MET in psychological research with a wide range of age groups. First, we provide a general introduction to MET. Next, we briefly review MET studies in adults and children that provide new insights into attention and its roles in cognitive and socioemotional functioning. We then discuss technical issues relating to MET data collection and provide guidelines for data quality inspection, gaze annotations, data visualization, and statistical analyses. Lastly, we conclude by discussing the future directions of MET implementation. Open-source programs for MET data quality inspection, data visualization, and analysis are shared publicly.

     
    more » « less