Abstract
To address the failure of existing reinforcement learning routing algorithms to balance action exploration and exploitation as network load changes, an environment-aware adaptive deep reinforcement learning routing algorithm is proposed. The ε-greedy strategy is adjusted dynamically according to the agent's average error during experience replay, so as to balance exploration and exploitation. Heuristic rules restrict action exploration so that positive experience accumulates, and a prioritized experience replay mechanism accelerates model convergence, improving network throughput and data delivery rate both before and after the agent converges. Simulation results show that a network deploying the proposed algorithm achieves higher throughput and data delivery rate than networks deploying the other baseline algorithms.
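The three mechanisms named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the update rule, so the mapping from replay error to ε (larger error means more exploration), the function names, and the parameters `eps_min`, `eps_max`, `scale`, `alpha` and `small` are all assumptions. The heuristic rules are represented only as a precomputed list of allowed actions, and the prioritized sampling follows the common proportional scheme.

```python
import math
import random


def adaptive_epsilon(mean_abs_error, eps_min=0.05, eps_max=1.0, scale=1.0):
    """Map the mean absolute replay error to an exploration rate.

    Assumed rule (not from the paper): epsilon rises monotonically from
    eps_min towards eps_max as the error grows.
    """
    if mean_abs_error < 0:
        raise ValueError("mean_abs_error must be non-negative")
    fraction = 1.0 - math.exp(-mean_abs_error / scale)  # in [0, 1)
    return eps_min + (eps_max - eps_min) * fraction


def epsilon_greedy(q_values, allowed_actions, epsilon, rng=random):
    """Pick an action, exploring only among heuristically allowed actions."""
    if not allowed_actions:
        raise ValueError("allowed_actions must not be empty")
    if rng.random() < epsilon:
        return rng.choice(allowed_actions)  # restricted exploration
    return max(allowed_actions, key=lambda a: q_values[a])  # exploitation


def sample_prioritized(errors, batch_size, alpha=0.6, small=1e-6, rng=random):
    """Sample replay indices with probability proportional to (|error| + small)^alpha."""
    weights = [(abs(e) + small) ** alpha for e in errors]
    return rng.choices(range(len(errors)), weights=weights, k=batch_size)
```

In a training loop under these assumptions, the agent would draw a batch with `sample_prioritized`, compute the mean absolute error of that batch, pass it to `adaptive_epsilon`, and use the resulting ε for the next call to `epsilon_greedy`.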
Authors
LI Jing (李婧)
HOU Shi-qi (侯诗琪)
College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 201306, China
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2023, Issue 11, pp. 3230-3237 (8 pages)
Funding
National Natural Science Foundation of China (61872230, 61572311).
Keywords
software-defined networking
intelligent routing
route selection
deep reinforcement learning
prioritized experience replay
adaptive
throughput