Learning Robust Adaptive Bitrate Algorithms with Adversarial Inverse Reinforcement Learning
-
Abstract
Adaptive bitrate (ABR) algorithms are crucial for video streaming services: they dynamically adjust the video bitrate to current network conditions to improve quality of experience (QoE). However, traditional ABR algorithms often struggle to adapt to diverse network environments and fail to fully exploit expert knowledge. In this study, we propose an approach that uses Adversarial Inverse Reinforcement Learning (AIRL) to learn ABR algorithms. Unlike traditional methods, AIRL leverages expert demonstrations to learn a robust reward function and to generate stable ABR policies. As the reward function is updated, the learned ABR policy is adjusted accordingly, so that it closely emulates the experts' bitrate decisions. Moreover, by decoupling the reward function, we obtain a more robust ABR strategy that adapts video bitrates to significant fluctuations in network conditions while also optimizing different QoE objectives. Experiments across various network conditions show that the proposed method achieves stable and superior performance.
-
-