Title: Towards retina-quality VR video streaming: 15ms could save you 80% of your bandwidth
Virtual reality systems today cannot yet stream immersive, retina-quality virtual reality video over a network. One of the greatest challenges to this goal is the sheer data rate required to transmit retina-quality video frames at high resolutions and frame rates. Recent work has leveraged the decay of visual acuity in human perception in novel gaze-contingent video compression techniques. In this paper, we show that reducing the motion-to-photon latency of a system itself is a key method for improving the compression ratio of gaze-contingent compression. Our key finding is that a client and streaming server system with sub-15ms latency can achieve 5x better compression than traditional techniques while also using simpler software algorithms than previous work.
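The 80% in the title and the 5x in the abstract appear to describe the same saving: a stream compressed r times better needs 1/r of the bandwidth, so the fraction saved is 1 - 1/r. A minimal sketch of that arithmetic follows; the function name `bandwidth_saved` is illustrative and does not come from the paper.

```python
def bandwidth_saved(compression_gain: float) -> float:
    """Fraction of bandwidth saved when compression improves by `compression_gain`x.

    Illustrative helper (not from the paper): a stream compressed r times
    better needs 1/r of the original bandwidth, so the saving is 1 - 1/r.
    """
    if compression_gain <= 0:
        raise ValueError("compression_gain must be positive")
    return 1.0 - 1.0 / compression_gain


# 5x better compression, as stated in the abstract
print(f"{bandwidth_saved(5):.0%}")  # prints 80%
```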
Award ID(s):
1909212 2045714 2039070
PAR ID:
10346863
Journal Name:
ACM SIGCOMM Computer Communication Review
Volume:
52
Issue:
1
ISSN:
0146-4833
Page Range / eLocation ID:
10 to 19
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. While recent work explored streaming volumetric content on-demand, there is little effort on live volumetric video streaming that bears the potential of bringing more exciting applications than its on-demand counterpart. To fill this critical gap, in this paper, we propose MetaStream, which is, to the best of our knowledge, the first practical live volumetric content capture, creation, delivery, and rendering system for immersive applications such as virtual, augmented, and mixed reality. To address the key challenge of the stringent latency requirement for processing and streaming a huge amount of 3D data, MetaStream integrates several innovations into a holistic system, including dynamic camera calibration, edge-assisted object segmentation, cross-camera redundant point removal, and foveated volumetric content rendering. We implement a prototype of MetaStream using commodity devices and extensively evaluate its performance. Our results demonstrate that MetaStream achieves low-latency live volumetric video streaming at close to 30 frames per second on WiFi networks. Compared to state-of-the-art systems, MetaStream reduces end-to-end latency by up to 31.7% while improving visual quality by up to 12.5%. 
  2. In recent years, streamed 360° videos have gained popularity within Virtual Reality (VR) and Augmented Reality (AR) applications. However, they are of much higher resolutions than 2D videos, causing greater bandwidth consumption when streamed. This increased bandwidth utilization puts tremendous strain on the network capacity of the cloud providers streaming these videos. In this paper, we introduce L3BOU, a novel, three-tier distributed software framework that reduces cloud-edge bandwidth in the backhaul network and lowers average end-to-end latency for 360° video streaming applications. The L3BOU framework achieves low bandwidth and low latency by leveraging edge-based, optimized upscaling techniques. L3BOU accomplishes this by utilizing down-scaled MPEG-DASH-encoded 360° video data, known as Ultra Low Resolution (ULR) data, that the L3BOU edge applies distributed super-resolution (SR) techniques on, providing a high quality video to the client. L3BOU is able to reduce the cloud-edge backhaul bandwidth by up to a factor of 24, and the optimized super-resolution multi-processing of ULR data provides a 10-fold latency decrease in super resolution upscaling at the edge. 
  3. In Cloud 3D, such as Cloud Gaming and Cloud Virtual Reality (VR), image frames are rendered and compressed (encoded) in the cloud, and sent to the clients for users to view. For low latency and high image quality, image compression techniques that are fast, achieve a high compression rate, and preserve image quality are preferable. This paper explores computation time reduction techniques for learned image compression to make it more suitable for cloud 3D. More specifically, we employed slim (low-complexity) and application-specific AI models to reduce the computation time without degrading image quality. Our approach is based on two key insights: (1) as the frames generated by a 3D application are highly homogeneous, application-specific compression models can improve the rate-distortion performance over a general model; (2) many computer-generated frames from 3D applications are less complex than natural photos, which makes it feasible to reduce the model complexity to accelerate compression computation. We evaluated our models on six gaming image datasets. The results show that our approach has rate-distortion performance similar to that of a state-of-the-art learned image compression algorithm, while obtaining about 5x to 9x speedup and reducing the compression time to under 1 s (0.74 s), bringing learned image compression closer to being viable for cloud 3D. Code is available at https://github.com/cloud-graphics-rendering/AppSpecificLIC.
  4. In eye-tracked augmented and virtual reality (AR/VR), instantaneous and accurate hands-free selection of virtual elements is still a significant challenge. Though other methods that involve gaze-coupled head movements or hovering can improve selection times in comparison to methods like gaze-dwell, they are either not instantaneous or have difficulty ensuring that the user’s selection is deliberate. In this paper, we present EyeShadows, an eye gaze-based selection system that takes advantage of peripheral copies (shadows) of items that allow for quick selection and manipulation of an object or corresponding menus. This method is compatible with a variety of different selection tasks and controllable items, avoids the Midas touch problem, does not clutter the virtual environment, and is context sensitive. We have implemented and refined this selection tool for VR and AR, including testing with optical and video see-through (OST/VST) displays. Moreover, we demonstrate that this method can be used for a wide range of AR and VR applications, including manipulation of sliders or analog elements. We test its performance in VR against three other selection techniques, including dwell (baseline), an inertial reticle, and head-coupled selection. Results showed that selection with EyeShadows was significantly faster than dwell (baseline), outperforming in the select and search and select tasks by 29.8% and 15.7%, respectively, though error rates varied between tasks. 
  5. Research incorporating either eye-tracking technology or immersive technology (virtual reality and 360 video) into studying teachers’ professional noticing is recent. Yet, such technologies allow a better understanding of the embodied nature of professional noticing. Thus, the goal of the current study is to examine how teachers’ eye-gaze in immersive representations of practice corresponds to their attending to children’s mathematics. Using a mixed methods approach, we incorporated eye-tracking technology embedded within a virtual reality environment to compare novice and expert teachers’ gaze duration with quality of professional noticing. Findings and results both corroborate and extend previous research evidence about important differences in professional noticing between expert and novice teachers. Specifically, the amount of experience, and thus familiarity, teachers have with being in a classroom may affect their physical movement in both real and virtual representations of practice. Additionally, findings and results emphasize the importance of teachers’ visual focus on students’ doing of mathematics across the classroom.

     