DASH Adaptive Bitrate Decision Algorithm Based on Reinforcement Learning (基于强化学习的DASH自适应码率决策算法研究)
Abstract: Current DASH client bitrate decisions rely on fixed control algorithms built on low-accuracy models of specific environments, so they struggle to capture and reflect the dynamic changes of a real network. This paper adopts an algorithm that combines proximal policy optimization (PPO), a reinforcement learning method, with deep neural networks. The algorithm learns the dynamic characteristics of the network environment to make its decisions, and adjusts the parameters of the policy network according to the output of the value network, gradually converging to the optimal policy. Experiments on real network trace datasets show that, compared with existing algorithms, it achieves a higher quality of experience (QoE) for users, incurs fewer buffer underflows, and keeps video playback smooth.
Authors: Feng Su-liu (冯苏柳); Jiang Xiu-hua (姜秀华) (School of Communication and Information Engineering, Communication University of China, Beijing 100024, China)
Source: Journal of Communication University of China: Science and Technology (《中国传媒大学学报(自然科学版)》), 2020, No. 2, pp. 59-64, 83 (7 pages in total)
Keywords: HTTP adaptive streaming; DASH; deep reinforcement learning; proximal policy optimization
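
The abstract says the policy network is adjusted according to the value network's output, but it does not give the loss function. As a rough illustration only, the sketch below shows the standard PPO clipped surrogate loss from the general reinforcement learning literature, not necessarily the exact form used by the authors. The function name `ppo_clip_loss`, the `clip_eps` default of 0.2, and the assumption that advantages have already been computed from value-network estimates are all hypothetical choices made for this example.

```python
import math

def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss, written as a quantity to minimize.

    new_logp / old_logp: log-probabilities of the chosen bitrate actions under
    the current and the data-collecting policy (assumed inputs).
    advantages: advantage estimates, assumed to come from a value network.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and old policy for this action.
        ratio = math.exp(lp_new - lp_old)
        # Clip the ratio to [1 - eps, 1 + eps] to limit the size of the update.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the clipped and unclipped objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate: the surrogate objective is maximized, so the loss is its negative.
    return -total / len(advantages)
```

In an actor-critic setup of the kind the abstract describes, this value would typically be minimized with respect to the policy network's parameters, while the value network is trained separately to predict returns.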