
RLAR: An Adaptive Routing Algorithm Based on Reinforcement Learning (cited by: 1)
Abstract: Current routing algorithms perform poorly in wide-area networks because they cannot adapt to varying topologies and unbalanced loads. To address this, RLAR, an adaptive routing algorithm based on reinforcement learning and implemented with a gradient ascent algorithm, is proposed. Reinforcement learning means learning a policy, that is, constructing a mapping from states to actions based on feedback from the environment; in essence, it evaluates a set of candidate policies by trial through interaction with the environment. Applying reinforcement learning to network routing optimization offers a new approach to routing research. RLAR is compared with several existing routing algorithms, and the simulation results show that it effectively improves network routing performance.
Source: Computer Engineering and Design (《计算机工程与设计》), CSCD, PKU Core Journal, 2011, No. 4, pp. 1190-1194 (5 pages)
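Funding: National Basic Research Program of China (973 Program) (2005CB321801); National Natural Science Foundation of China (60873215, 60621003); Specialized Research Fund for the Doctoral Program of Higher Education (200899980003)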
Keywords: reinforcement learning; routing; gradient ascent; Markov decision process (MDP); adaptive

References (22)

  • 1 Humphrys M. Action selection methods using reinforcement learning [D]. Cambridge: University of Cambridge, 1996.
  • 2 Kaelbling L P. Reinforcement learning: A survey [C]. Artificial Intelligence Research, 1996: 237-285.
  • 3 Tesauro G J. Temporal difference learning and TD-Gammon [J]. Communications of the ACM, 1995, 38: 58-68.
  • 4 Crites R H, Barto A G. Elevator group control using multiple reinforcement learning agents [J]. Machine Learning, 1998, 32: 235-262.
  • 5 Marbach P, Mihatsch O, Schulte M, et al. Reinforcement learning for call admission control and routing in integrated service networks [C]. IEEE Conference on Decision and Control, 1998.
  • 6 Carlstrom J. Reinforcement learning for admission control and routing [D]. Uppsala, Sweden: Uppsala University, 2000.
  • 7 Brown T X, Tong H, Singh S P. Optimizing admission control while ensuring quality of service in multimedia networks via reinforcement learning [J]. Advances in Neural Information Processing Systems, 1999, 12: 982-988.
  • 8 Boyan J, Littman M L. Packet routing in dynamically changing networks: A reinforcement learning approach [J]. Advances in Neural Information Processing Systems, 1994, 7: 671-678.
  • 9 Wolpert D H, Tumer K, Frank J. Using collective intelligence to route internet traffic [J]. Advances in Neural Information Processing Systems, 1998, 11: 952-958.
  • 10 Nigel J B, Tao, Weaver L. A multi-agent, policy gradient approach to network routing [C]. Proc of the Eighteenth International Conference on Machine Learning, 2001.

