Title: Toward Next-generation Volumetric Video Streaming with Neural-based Content Representations
Striking a balance between minimizing bandwidth consumption and maintaining high visual quality stands as the paramount objective in volumetric content delivery. However, achieving this ambitious target is a substantial challenge, especially for mobile devices with constrained computational resources, given the voluminous amount of 3D data to be streamed, strict latency requirements, and high computational load. Inspired by the advantages offered by neural radiance fields (NeRF), we propose, for the first time, to deliver volumetric videos by utilizing neural-based content representations. We delve deep into potential challenges and explore viable solutions for both video-on-demand (VOD) and live video streaming services, in terms of the end-to-end pipeline, real-time and high-quality streaming, rate adaptation, and viewport adaptation. Our preliminary results lend credence to the feasibility of our research proposition, offering a promising starting point for further investigation.
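The rate adaptation the abstract proposes could, at its simplest, pick a NeRF model quality per segment from a bandwidth estimate. A minimal sketch, assuming a hypothetical three-rung quality ladder (the labels, sizes, and scores below are illustrative assumptions, not values from the paper):

```python
# Hypothetical ladder: (label, segment size in megabits, quality score).
# These numbers are assumptions for illustration only.
LADDER = [
    ("low", 4.0, 0.70),
    ("medium", 8.0, 0.85),
    ("high", 16.0, 0.95),
]

def pick_quality(bandwidth_mbps: float, segment_seconds: float) -> str:
    """Pick the highest quality whose download fits within the segment duration."""
    budget_mbits = bandwidth_mbps * segment_seconds
    best = LADDER[0][0]  # fall back to the lowest rung
    for label, size_mbits, _ in LADDER:
        if size_mbits <= budget_mbits:
            best = label
    return best
```

A real controller would also account for buffer occupancy and the client's rendering load, but the core decision has this shape.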
Award ID(s):
2212296 2235049
PAR ID:
10467143
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400703393
Page Range / eLocation ID:
199 to 207
Format(s):
Medium: X
Location:
Madrid Spain
Sponsoring Org:
National Science Foundation
More Like this
  1. While recent work has explored streaming volumetric content on-demand, little effort has gone into live volumetric video streaming, which has the potential to enable even more exciting applications than its on-demand counterpart. To fill this critical gap, in this paper, we propose MetaStream, which is, to the best of our knowledge, the first practical live volumetric content capture, creation, delivery, and rendering system for immersive applications such as virtual, augmented, and mixed reality. To address the key challenge of the stringent latency requirement for processing and streaming a huge amount of 3D data, MetaStream integrates several innovations into a holistic system, including dynamic camera calibration, edge-assisted object segmentation, cross-camera redundant point removal, and foveated volumetric content rendering. We implement a prototype of MetaStream using commodity devices and extensively evaluate its performance. Our results demonstrate that MetaStream achieves low-latency live volumetric video streaming at close to 30 frames per second on WiFi networks. Compared to state-of-the-art systems, MetaStream reduces end-to-end latency by up to 31.7% while improving visual quality by up to 12.5%.
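Cross-camera redundant point removal, as named above, can be illustrated by snapping points from all cameras onto a coarse voxel grid and keeping one point per occupied voxel. This is a generic sketch of the idea, not MetaStream's actual implementation; the voxel size is an assumed parameter:

```python
def dedup_points(points, voxel_size=0.01):
    """Merge point clouds from multiple cameras, keeping one point per voxel.

    points: iterable of (x, y, z) tuples in a shared world frame.
    voxel_size: edge length of the deduplication grid cells (assumed).
    """
    seen = set()
    kept = []
    for x, y, z in points:
        # Quantize each coordinate to a grid cell index.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        if key not in seen:
            seen.add(key)
            kept.append((x, y, z))
    return kept
```

Points captured by two cameras that land in the same cell are transmitted once, which is where the bandwidth saving comes from.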
  2. Accessing high-quality video content can be challenging due to insufficient and unstable network bandwidth. Recent advances in neural enhancement have shown promising results in improving the quality of degraded videos through deep learning. Neural-Enhanced Streaming (NES) incorporates this new approach into video streaming, allowing users to download low-quality video segments and then enhance them to obtain high-quality content without violating the playback of the video stream. We introduce BONES, an NES control algorithm that jointly manages the network and computational resources to maximize the quality of experience (QoE) of the user. BONES formulates NES as a Lyapunov optimization problem and solves it in an online manner with near-optimal performance, making it the first NES algorithm to provide a theoretical performance guarantee. Comprehensive experimental results indicate that BONES increases QoE by 5% to 20% over state-of-the-art algorithms with minimal overhead. Our code is available at https://github.com/UMass-LIDS/bones. 
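The drift-plus-penalty decision at the heart of a Lyapunov formulation like the one BONES describes can be sketched as follows. Everything here is an assumed toy model (the action set, queue semantics, and the weight V are illustrative, not the paper's): each action pairs a download bitrate with an optional enhancement level, and the controller picks the action minimizing queue drift plus V times the quality penalty, online, with no knowledge of the future:

```python
V = 10.0  # assumed trade-off weight between queue stability and QoE

# Hypothetical actions: (download cost in megabits, enhancement compute cost,
# resulting quality in [0, 1]).
ACTIONS = [
    (4.0, 0.0, 0.6),   # low bitrate, no enhancement
    (4.0, 1.5, 0.8),   # low bitrate + neural enhancement
    (12.0, 0.0, 0.9),  # high bitrate, no enhancement
]

def choose_action(net_queue: float, cpu_queue: float):
    """Minimize drift-plus-penalty: backlogged queues make their resource
    expensive; V weights the quality shortfall against stability."""
    def score(action):
        download, enhance, quality = action
        drift = net_queue * download + cpu_queue * enhance
        penalty = V * (1.0 - quality)
        return drift + penalty
    return min(ACTIONS, key=score)
```

With empty queues the controller splurges on the high bitrate; as backlogs grow, it retreats to the cheap action, which is the behavior a Lyapunov controller trades its optimality bound for.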
  3. Emerging multimedia applications often use a wireless LAN (Wi-Fi) infrastructure to stream content. These Wi-Fi deployments vary vastly in terms of their system configurations. In this paper, we take a step toward characterizing the Quality of Experience (QoE) of volumetric video streaming over an enterprise-grade Wi-Fi network to: (i) understand the impact of Wi-Fi control parameters on user QoE, (ii) analyze the relation between Quality of Service (QoS) metrics of Wi-Fi networks and application QoE, and (iii) compare the QoE of volumetric video streaming to traditional 2D video applications. We find that Wi-Fi configuration parameters such as channel width, radio interface, access category, and priority queues are important for optimizing Wi-Fi networks for streaming immersive videos. 
  4. Neural Radiance Field (NeRF) has emerged as a powerful technique for 3D scene representation due to its high rendering quality. Among its applications, mobile NeRF video-on-demand (VoD) is especially promising, benefiting from both the scalability of mobile devices and the immersive experience offered by NeRF. However, streaming NeRF videos over real-world networks presents significant challenges, particularly due to limited bandwidth and temporal dynamics. To address these challenges, we propose NeRFlow, a novel framework that enables adaptive streaming for NeRF videos through both bitrate and viewpoint adaptation. NeRFlow solves three fundamental problems: first, it employs a rendering-adaptive pruning technique to determine voxel importance, selectively reducing data size without sacrificing rendering quality. Second, it introduces a viewpoint-aware adaptation module that efficiently compensates for uncovered regions in real time by combining pre-encoded master and sub-frames. Third, it incorporates a QoE-aware bitrate ladder generation framework, leveraging a genetic algorithm to optimize the number and configuration of bitrates while accounting for bandwidth dynamics and ABR algorithms. Through extensive experiments, NeRFlow is demonstrated to effectively improve user Quality of Experience (QoE) by 31.3% to 41.2%, making it an efficient solution for NeRF video streaming.
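Importance-based voxel pruning of the kind described above reduces to ranking voxels by their estimated contribution to rendered views and keeping only the top fraction before encoding. A minimal sketch, assuming importance scores are already available (how NeRFlow actually computes them is not shown here):

```python
def prune_voxels(voxels, keep_ratio=0.5):
    """voxels: list of (voxel_id, importance) pairs. Keep the top
    keep_ratio fraction by importance, discarding the rest before
    the representation is encoded and streamed."""
    ranked = sorted(voxels, key=lambda v: v[1], reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:keep]
```

The interesting part in practice is estimating importance rendering-adaptively, i.e., weighting voxels by how much they affect the views users actually request.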
  5. Super-resolution (SR) is a well-studied technique for reconstructing high-resolution (HR) images from low-resolution (LR) ones. SR holds great promise for video streaming since an LR video segment can be transmitted from the video server to the client that then reconstructs the HR version using SR, resulting in a significant reduction in network bandwidth. However, SR is seldom used in practice for real-time video streaming, because the computational overhead of frame reconstruction results in large latency and low frame rate. To reduce the computational overhead and make SR practical, we propose a deep-learning-based SR method called Foveated Cascaded Video Super Resolution (focas). focas relies on the fact that human eyes only have high acuity in a tiny central foveal region of the retina. focas uses more neural network blocks in the foveal region to provide higher video quality, while using fewer blocks in the periphery as lower quality is sufficient. To optimize the computational resources and reduce reconstruction latency, focas formulates and solves a convex optimization problem to decide the number of neural network blocks to use in each region of the frame. Using extensive experiments, we show that focas reduces the latency by 50%-70% while maintaining visual quality comparable to traditional (non-foveated) SR. Further, focas provides a 12-16x reduction in the client-to-server network bandwidth in comparison with sending the full HR video segments.
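The per-region block allocation focas solves as a convex program can be approximated by a greedy stand-in: spend blocks where the marginal quality gain per unit of latency is highest, with diminishing returns as a region accumulates blocks. All region names, costs, and gains below are illustrative assumptions, not the paper's model:

```python
# Per-region: (name, latency cost per extra block, base quality gain per block).
# Hypothetical values: foveal blocks are cheap and valuable, peripheral ones
# add little perceived quality.
REGIONS = [("fovea", 1.0, 1.0), ("blend", 2.0, 0.5), ("periphery", 4.0, 0.1)]

def allocate_blocks(latency_budget: float):
    """Return {region: block count}, chosen greedily by marginal gain per
    unit latency, with gain shrinking as a region gains blocks."""
    alloc = {name: 0 for name, _, _ in REGIONS}
    spent = 0.0
    while True:
        # Affordable regions, ranked by diminishing marginal gain per cost.
        candidates = [
            (gain / ((alloc[name] + 1) * cost), name, cost)
            for name, cost, gain in REGIONS
            if spent + cost <= latency_budget
        ]
        if not candidates:
            return alloc
        _, name, cost = max(candidates)
        alloc[name] += 1
        spent += cost
```

Under these assumed numbers the fovea ends up with the most blocks and the periphery with the fewest, mirroring the qualitative behavior the abstract describes; the actual system solves the allocation exactly as a convex problem.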