Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Free, publicly-accessible full text available August 4, 2025
-
The adaptive bitrate selection (ABR) mechanism, which decides the bitrate for each video chunk is an important part of video streaming. There has been significant interest in developing Reinforcement-Learning (RL) based ABR algorithms because of their ability to learn efficient bitrate actions based on past data and their demonstrated improvements over wired, 3G and 4G networks. However, the Quality of Experience (QoE), especially video stall time, of state-of-the-art ABR algorithms including the RL-based approaches falls short of expectations over commercial mmWave 5G networks, due to widely and wildly fluctuating throughput. These algorithms find optimal policies for a multi-objective unconstrained problem where the policies inherently depend on the predefined weight parameters of the multiple objectives (e.g., bitrate maximization, stall-time minimization). Our empirical evaluation suggests that such a policy cannot adequately adapt to the high variations of 5G throughput, resulting in long stall times. To address these issues, we formulate the ABR selection problem as a constrained Markov Decision Process where the objective is to maximize the QoE subject to a stall-time constraint. The strength of this formulation is that it helps mitigate the stall time while maintaining high bitrates. We propose COREL, a primal-dual actor-critic RL algorithm, which incorporates an additional critic network to estimate stall time compared to existing RL-based approaches and can tune the optimal dual variable or weight to guide the policy towards minimizing stall time. Our experiment results across various commercial mmWave 5G traces reveal that COREL reduces the average stall time by a factor of 4 and the 95th percentile by a factor of 2.more » « less