Title: Grad: Learning for Overhead-aware Adaptive Video Streaming with Scalable Video Coding
Video streaming commonly uses Dynamic Adaptive Streaming over HTTP (DASH) to deliver good Quality of Experience (QoE) to users. Videos used in DASH are predominantly encoded by single-layered video coding such as H.264/AVC. In comparison, multi-layered video coding such as H.264/SVC provides more flexibility for upgrading the quality of buffered video segments and has the potential to further improve QoE. However, there are two challenges for using SVC in DASH: (i) the complexity in designing adaptive bitrate (ABR) algorithms; and (ii) the negative impact of SVC's coding overhead. In this work, we propose a deep reinforcement learning method called Grad for designing ABR algorithms that take advantage of the quality upgrade mechanism of SVC. Additionally, we quantify the impact of coding overhead on the achievable QoE of SVC in DASH, and propose jump-enabled hybrid coding (HYBJ) to mitigate the impact. Through emulation, we demonstrate that Grad-HYBJ, an ABR algorithm for HYBJ learned by Grad, outperforms the best performing state-of-the-art ABR algorithm by 17% in QoE.
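The trade-off between SVC's quality upgrade mechanism and its coding overhead can be illustrated with a small sketch. This is not the model used in the paper: the linear per-layer overhead, the segment sizes, and the function names (`svc_layer_sizes`, `svc_upgrade_cost`, `avc_upgrade_cost`) are all assumptions chosen for illustration only.

```python
from typing import List


def svc_layer_sizes(avc_sizes: List[float], overhead: float) -> List[float]:
    """Return hypothetical per-layer sizes of an SVC-encoded segment.

    avc_sizes[k] is the size of a single-layer (AVC) segment at quality k.
    ASSUMPTION: the SVC layers 0..k together cost avc_sizes[k] * (1 + overhead * k),
    i.e. every enhancement layer adds a fixed fractional overhead.
    """
    cumulative = [size * (1 + overhead * k) for k, size in enumerate(avc_sizes)]
    layers = [cumulative[0]]
    for k in range(1, len(cumulative)):
        layers.append(cumulative[k] - cumulative[k - 1])
    return layers


def svc_upgrade_cost(layers: List[float], current: int, target: int) -> float:
    """Data needed to upgrade a buffered SVC segment: only the missing layers."""
    return sum(layers[current + 1 : target + 1])


def avc_upgrade_cost(avc_sizes: List[float], target: int) -> float:
    """Data needed to upgrade a buffered AVC segment: the whole segment again."""
    return avc_sizes[target]


if __name__ == "__main__":
    avc = [1.0, 2.0, 4.0]          # hypothetical segment sizes in MB
    layers = svc_layer_sizes(avc, overhead=0.1)
    print("SVC layers:", layers)
    print("SVC upgrade 0 -> 2:", svc_upgrade_cost(layers, 0, 2))
    print("AVC upgrade 0 -> 2:", avc_upgrade_cost(avc, 2))
    print("SVC direct download at quality 2:", sum(layers))
```

Under these assumed numbers, upgrading an already buffered segment is cheaper with SVC, while downloading the top quality directly is cheaper with single-layer coding. This is the kind of tension between upgrade flexibility and overhead that the abstract describes.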
Award ID(s):
1763617 1901137
Journal Name:
ACM Multimedia Conference (MM'20)
Page Range / eLocation ID:
349 to 357
Sponsoring Org:
National Science Foundation
More Like this
  1. Achieving reliable acoustic wireless video transmissions in the extreme and uncertain underwater environment is a challenge due to the limited bandwidth and the error-prone nature of the channel. Aiming at optimizing the received video quality and the user's experience, an adaptive solution for underwater video transmissions is proposed that is specifically designed for Multi-Input Multi-Output (MIMO)-based Software-Defined Acoustic Modems (SDAMs). To keep the video distortion under an acceptable threshold and to keep the Physical-Layer Throughput (PLT) high, cross-layer techniques utilizing diversity-spatial multiplexing and Unequal Error Protection (UEP) are presented along with the scalable video compression at the application layer. Specifically, the scalability of the utilized SDAM with high processing capabilities is exploited in the proposed structure along with the temporal, spatial, and quality scalability of the Scalable Video Coding (SVC) H.264/MPEG-4 AVC compression standard. The transmitter broadcasts one video stream and realizes multicasting at different users. Experimental results at the Sonny Werblin Recreation Center, Rutgers University-NJ, are presented. Several scenarios for unknown channels at the transmitter are experimentally considered when the hydrophones are placed in different locations in the pool to achieve the required SVC-based video Quality of Service (QoS) and Quality of Experience (QoE) given the channel state information and the robustness of different SVC scalability. The video quality level is determined by the best communication link while the transmission scheme is decided based on the worst communication link, which guarantees that each user is able to receive the video with appropriate quality.
  2. The adaptive bitrate selection (ABR) mechanism, which decides the bitrate for each video chunk, is an important part of video streaming. There has been significant interest in developing Reinforcement-Learning (RL) based ABR algorithms because of their ability to learn efficient bitrate actions based on past data and their demonstrated improvements over wired, 3G and 4G networks. However, the Quality of Experience (QoE), especially video stall time, of state-of-the-art ABR algorithms including the RL-based approaches falls short of expectations over commercial mmWave 5G networks, due to widely and wildly fluctuating throughput. These algorithms find optimal policies for a multi-objective unconstrained problem where the policies inherently depend on the predefined weight parameters of the multiple objectives (e.g., bitrate maximization, stall-time minimization). Our empirical evaluation suggests that such a policy cannot adequately adapt to the high variations of 5G throughput, resulting in long stall times. To address these issues, we formulate the ABR selection problem as a constrained Markov Decision Process where the objective is to maximize the QoE subject to a stall-time constraint. The strength of this formulation is that it helps mitigate the stall time while maintaining high bitrates. We propose COREL, a primal-dual actor-critic RL algorithm, which, compared to existing RL-based approaches, incorporates an additional critic network to estimate stall time and can tune the optimal dual variable or weight to guide the policy towards minimizing stall time. Our experimental results across various commercial mmWave 5G traces reveal that COREL reduces the average stall time by a factor of 4 and the 95th percentile by a factor of 2.
  3. Delivering videos under less-than-ideal network conditions without compromising end-users' quality of experience is a hard problem. Virtually all prior work follows a piecemeal approach: either "tweaking" the fully reliable transport layer or making the client "smarter." We propose VOXEL, a cross-layer optimization system for video streaming. We use VOXEL to demonstrate how to combine application-provided "insights" with a partially reliable protocol for optimizing video streaming. To this end, we present a novel ABR algorithm that explicitly trades off losses for improving end-users' video-watching experiences. VOXEL is fully compatible with DASH, and backward-compatible with VOXEL-unaware servers and clients. In our experiments emulating a wide range of network conditions, VOXEL outperforms the state-of-the-art: We stream videos in the 90th-percentile with up to 97% less rebuffering than the state-of-the-art without sacrificing visual fidelity. We also demonstrate the benefits of VOXEL for small-buffer regimes like the emerging use case of low-latency and live streaming. In a survey of 54 real users, 84% of the participants indicated that they prefer videos streamed using VOXEL compared to the state-of-the-art.
  4. Adaptive bitrate (ABR) algorithms aim to make optimal bitrate decisions in dynamically changing network conditions to ensure a high quality of experience (QoE) for the users during video streaming. However, most of the existing ABRs share the limitations of predefined rules and incorrect assumptions about streaming parameters. They also fall short of considering the perceived quality in their QoE model, target higher bitrates regardless, and ignore the corresponding energy consumption. This joint approach results in additional energy consumption and becomes a burden, especially for mobile device users. This paper proposes GreenABR, a new deep reinforcement learning-based ABR scheme that optimizes the energy consumption during video streaming without sacrificing the user QoE. GreenABR employs a standard perceived quality metric, VMAF, and real power measurements collected through a streaming application. GreenABR's deep reinforcement learning model makes no assumptions about the streaming environment and learns how to adapt to the dynamically changing conditions in a wide range of real network scenarios. GreenABR outperforms the existing state-of-the-art ABR algorithms by saving up to 57% in streaming energy consumption and 60% in data consumption while achieving up to 22% more perceptual QoE due to up to 84% less rebuffering time and near-zero capacity violations.
  5. Video signal transmission enables a wide range of applications in the underwater environment, such as coastal and tactical multimedia surveillance, undersea/offshore exploration, oil pipe/bridge inspection, and video monitoring of geological/biological processes from the seafloor to the air-sea interface, all of which require real-time multimedia acquisition and classification. Yet, it is a challenge to achieve an efficient and reliable video transmission, due to the spectrum limitations underwater and also the error-prone nature of the acoustic channel. In this paper, we propose a pairwise scheme to manage the video distortion-rate tradeoff for underwater video transmission. The proposed Multi-input Multi-output (MIMO)-based Software-Defined Acoustic Radio (SDAR) system, on the one hand, adapts itself in a timely manner to meet the needs of both video compression and the underwater channel and, on the other hand, keeps the overall video distortion (caused by the coder/decoder and the channel) under an acceptable threshold. The scalability of Universal Software Radio Peripheral (USRP) with high processing capabilities is exploited in the proposed structure along with the temporal, spatial and quality scalability of the Scalable Video Coding (SVC) H.264/MPEG-4 AVC compression standard. Experimental results at Sonny Werblin Recreation Center, Rutgers University, as well as simulations are presented, while more experiments are in progress to evaluate the performance of our testbed in more challenging environments such as in the Raritan River, New Jersey.