

Title: Navigation Graph for Tiled Media Streaming
Since the emergence of video streaming services, more creative and diverse multimedia content has become available, and the capability of streaming 360-degree videos will open a new era of multimedia experiences. However, streaming these videos requires higher bandwidth and lower latency than conventional video streaming systems provide. Rate adaptation of tiled videos and view prediction techniques are used to address this problem. In this paper, we introduce the Navigation Graph, which models viewing behavior in the temporal (segments) and spatial (tiles) domains to perform rate adaptation of tiled media in conjunction with view prediction. The Navigation Graph allows clients to perform view prediction more easily by sharing the viewing model in the same way media description information is shared in DASH. It is also useful for encoding trajectory information in the media description file, which could allow more efficient navigation of 360-degree videos. This paper describes the creation of the Navigation Graph and its uses. The performance evaluation shows that Navigation Graph-based view prediction and rate adaptation outperform existing tiled media streaming solutions. The Navigation Graph is not limited to 360-degree video streaming applications; it can also be applied to other tiled media streaming systems, such as volumetric media streaming for augmented reality applications.
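The abstract describes the Navigation Graph only at a high level. As a rough illustration (not the authors' implementation), the Python sketch below treats a node as a (segment index, viewed tile set) pair, weights directed edges by how many prior viewers made each transition, and derives per-tile probabilities for the next segment; the class name, the tile-set node representation, and the uniform fallback are assumptions made for the example.

```python
from collections import defaultdict

class NavigationGraph:
    """Minimal sketch of a navigation graph: nodes are (segment index, set of
    viewed tiles) pairs, and a directed edge counts how many prior viewers
    moved from one node to the next. Illustration only, not the paper's code."""

    def __init__(self):
        # edge_counts[src_node][dst_node] = number of viewers who made that move
        self.edge_counts = defaultdict(lambda: defaultdict(int))

    def add_trace(self, trace):
        """trace: list of per-segment tile-id sets viewed by one prior viewer."""
        nodes = [(seg, frozenset(tiles)) for seg, tiles in enumerate(trace)]
        for src, dst in zip(nodes, nodes[1:]):
            self.edge_counts[src][dst] += 1

    def predict_next_tiles(self, segment, current_tiles):
        """Return per-tile viewing probabilities for the next segment, based on
        the outgoing edges of the current node (fallback: keep the same tiles)."""
        outgoing = self.edge_counts.get((segment, frozenset(current_tiles)), {})
        total = sum(outgoing.values())
        if total == 0:
            return {t: 1.0 for t in current_tiles}
        probs = defaultdict(float)
        for (_seg, next_tiles), count in outgoing.items():
            for t in next_tiles:
                probs[t] += count / total
        return dict(probs)
```

For example, after feeding one prior viewer's trace, a client could query the next-segment tile probabilities and weight its tile bitrate requests accordingly:

```python
ng = NavigationGraph()
ng.add_trace([{0, 1}, {1, 2}, {2, 3}])     # tile sets one prior viewer watched
print(ng.predict_next_tiles(0, {0, 1}))    # -> {1: 1.0, 2: 1.0}
```

Keeping only transition counts keeps the model compact enough to be shared with clients alongside the media description, which is the property the abstract emphasizes.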
Award ID(s):
1900875
PAR ID:
10187414
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM International Conference on Multimedia
Page Range / eLocation ID:
447 to 455
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Future view prediction for a 360-degree video streaming system is important to save network bandwidth and improve the Quality of Experience (QoE). Historical view data from a single viewer and from multiple viewers have been used for future view prediction. Video semantic information is also useful for predicting a viewer's future behavior. However, extracting video semantic information requires powerful computing hardware and large memory space to perform deep learning-based video analysis, which is not practical for most client devices, such as small mobile devices or Head Mounted Displays (HMDs). Therefore, we develop an approach where video semantic analysis is executed on the media server, and the analysis results are shared with clients via the Semantic Flow Descriptor (SFD) and View-Object State Machine (VOSM). SFD and VOSM become new descriptive additions to the Media Presentation Description (MPD) and Spatial Relation Description (SRD) to support 360-degree video streaming. Using this semantic-based approach, we design the Semantic-Aware View Prediction System (SEAWARE) to improve overall view prediction performance. Evaluation results with 360-degree videos and real HMD view traces show that the SEAWARE system improves view prediction performance and streams high-quality video with limited network bandwidth.
  2.
    We demonstrate a 360-degree video navigation and streaming system for mobile HMD devices. The Navigation Graph (NG) concept is used to predict future views, using a graph model that captures both the temporal and spatial viewing behavior of prior viewers. Visualization of 360-degree content navigation and of the view prediction algorithm is used to assess Quality of Experience (QoE) and to evaluate the accuracy of the NG-based view prediction algorithm.
  3. Bulterman, Dick; Kankanhalli, Mohan; Muehlhaeuser, Max; Persia, Fabio; Sheu, Philip; Tsai, Jeffrey (Eds.)
    The emergence of 360-video streaming systems has brought about new possibilities for immersive video experiences while requiring significantly higher bandwidth than traditional 2D video streaming. Viewport prediction is used to address this problem, but interesting storylines outside the viewport are ignored. To address this limitation, we present SAVG360, a novel viewport guidance system that utilizes global content information available on the server side to enhance streaming with the best saliency-captured storyline of 360-videos. The saliency analysis is performed offline on the media server with a powerful GPU, and the saliency-aware guidance information is encoded and shared with clients through the Saliency-aware Guidance Descriptor. This enables the system to proactively guide users to switch between storylines of the video and allows users to follow or break away from guided storylines through a novel user interface. Additionally, we present a viewing mode prediction algorithm to enhance video delivery in SAVG360. Evaluation of user viewport traces in 360-videos demonstrates that SAVG360 outperforms existing tiled streaming solutions in terms of overall viewport prediction accuracy and the ability to stream high-quality 360 videos under bandwidth constraints. Furthermore, a user study highlights the advantages of our proactive guidance approach over simply predicting and streaming where users look. (A simplified, hypothetical sketch of saliency-guided viewport selection appears after this list.)
  4. As virtual reality (VR) offers an experience beyond any existing multimedia technology, VR videos, also called 360-degree videos, have attracted considerable attention from academia and industry. How to quantify and model end users' perceived quality when watching 360-degree videos, known as QoE, is central to high-quality provisioning of these multimedia services. In this work, we present EyeQoE, a novel QoE assessment model for 360-degree videos using ocular behaviors. Unlike prior approaches, which mostly rely on objective factors, EyeQoE leverages the new ocular sensing modality to comprehensively capture both subjective and objective impact factors for QoE modeling. We propose a novel method that models eye-based cues as graphs and develop a GCN-based classifier to produce QoE assessments by extracting intrinsic features from graph-structured data. We further exploit a Siamese network to eliminate the impact of subject and visual stimulus heterogeneity. A domain adaptation scheme named MADA is also devised to generalize our model to a vast range of unseen 360-degree videos. Extensive tests are carried out with our collected dataset. Results show that EyeQoE achieves the best prediction accuracy at 92.9%, outperforming state-of-the-art approaches. As another contribution of this work, we have made our dataset publicly available at https://github.com/MobiSec-CSE-UTA/EyeQoE_Dataset.git. (A purely illustrative sketch of building a gaze-fixation graph appears after this list.)
  5.
    We investigate a novel communications system that integrates scalable multi-layer 360-degree video tiling, viewport-adaptive rate-distortion optimal resource allocation, and VR-centric edge computing and caching, to enable future high-quality untethered VR streaming. Our system comprises a collection of 5G small cells that can pool their communication, computing, and storage resources to collectively deliver scalable 360-degree video content to mobile VR clients at much higher quality. Our major contributions are the rigorous design of multi-layer 360-degree tiling and related models of statistical user navigation, and the analysis and optimization of edge-based multi-user VR streaming that integrates viewport adaptation and server cooperation. We also explore the possibility of network-coded data operation and its implications for the analysis, optimization, and system performance we pursue here. We demonstrate considerable gains in delivered immersion fidelity, featuring much higher 360-degree viewport peak signal-to-noise ratio (PSNR) as well as higher VR video frame rates and spatial resolutions. (A simplified greedy layer-allocation sketch appears after this list.)
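For the SAVG360 entry (item 3 above), the abstract does not detail how the guided storyline is chosen per segment. As a loose, hypothetical illustration only, assuming the server has a per-tile saliency score for each segment, the guided viewport could simply be the tile window with the highest total saliency:

```python
def pick_guided_viewport(saliency, cols, rows, vp_w, vp_h):
    """saliency: dict mapping (col, row) tile coordinates to a saliency score
    for one segment. Returns the top-left tile of the vp_w x vp_h window with
    the highest total saliency, wrapping horizontally (360-degree panorama).
    Purely illustrative; not the SAVG360 algorithm."""
    best_origin, best_score = (0, 0), float("-inf")
    for col in range(cols):                     # horizontal wrap-around
        for row in range(rows - vp_h + 1):      # no vertical wrap
            score = sum(
                saliency.get(((col + dc) % cols, row + dr), 0.0)
                for dc in range(vp_w)
                for dr in range(vp_h)
            )
            if score > best_score:
                best_origin, best_score = (col, row), score
    return best_origin
```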
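For the EyeQoE entry (item 4 above), the abstract states that ocular cues are modeled as graphs and classified with a GCN, but not how the graphs are built. The sketch below is purely illustrative and not taken from the paper: gaze fixations become nodes with [x, y, duration] features, consecutive fixations are linked, and one standard graph-convolution layer (the usual symmetric normalization with self-loops) is applied.

```python
import numpy as np

def gaze_to_graph(fixations):
    """fixations: list of (x, y, duration_ms) gaze fixations from one viewing.
    Nodes carry [x, y, duration]; edges link consecutive fixations in time
    (an illustrative construction, not the one used in the paper)."""
    feats = np.array([[x, y, dur] for x, y, dur in fixations], dtype=float)
    n = len(fixations)
    adj = np.zeros((n, n))
    for i in range(n - 1):                      # undirected temporal chain
        adj[i, i + 1] = adj[i + 1, i] = 1.0
    return feats, adj

def gcn_layer(feats, adj, weights):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    a_hat = adj + np.eye(len(adj))
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return np.maximum(0.0, d_inv_sqrt @ a_hat @ d_inv_sqrt @ feats @ weights)
```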
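For the edge-assisted VR streaming entry (item 5 above), the rate-distortion-optimal allocation itself is not reproduced here. As a rough stand-in, a greedy allocator can repeatedly add the scalable enhancement layer with the best viewport-weighted quality gain per bit until the bandwidth budget is spent; the data layout and the probability weighting below are assumptions for illustration.

```python
import heapq

def greedy_layer_allocation(tiles, budget):
    """tiles: dict tile_id -> {"view_prob": float,
                               "layers": [(bits, quality_gain), ...]}  (base layer first)
    Greedily adds the enhancement layer with the best viewport-weighted quality
    gain per bit until the bandwidth budget is exhausted. Returns the number of
    layers selected per tile; the base layer is always sent."""
    selected = {t: 1 for t in tiles}
    spent = sum(info["layers"][0][0] for info in tiles.values())
    heap = []  # entries: (-weighted_gain_per_bit, tile_id, layer_index)
    for t, info in tiles.items():
        if len(info["layers"]) > 1:
            bits, gain = info["layers"][1]
            heapq.heappush(heap, (-info["view_prob"] * gain / bits, t, 1))
    while heap:
        _, t, idx = heapq.heappop(heap)
        bits, gain = tiles[t]["layers"][idx]
        if spent + bits > budget:
            continue                  # cannot afford this layer; try cheaper candidates
        spent += bits
        selected[t] = idx + 1
        if idx + 1 < len(tiles[t]["layers"]):
            nbits, ngain = tiles[t]["layers"][idx + 1]
            heapq.heappush(heap, (-tiles[t]["view_prob"] * ngain / nbits, t, idx + 1))
    return selected
```

A greedy marginal-utility rule like this is a common heuristic stand-in for the full optimization; the paper's formulation additionally accounts for edge caching and multi-user cooperation, which this sketch ignores.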