Actor-Critic Algorithm Based on Tile Coding and Model Learning (cited by: 3)

Abstract: Actor-Critic (AC) methods are a class of reinforcement learning methods with good performance and convergence guarantees. However, while learning and improving its policy, the agent does not learn the dynamics of the environment, which limits the performance of AC methods to some extent. In addition, AC methods must represent the policy and the value function approximately, and both the encoding of states and actions and its parameters have an important influence on these methods. Tile Coding is simple to use and has low computational time complexity. We therefore combine Tile Coding with a model-based Actor-Critic method and apply the resulting algorithm to reinforcement learning simulation experiments. The experimental results show that the resulting algorithm performs well.
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2014, No. 6, pp. 239-242, 249 (5 pages)
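Funding: Supported by the National Natural Science Foundation of China (61070122, 61373094, 61070223, 61103045), the Natural Science Foundation of Jiangsu Province (BK2009116), and the Natural Science Research Project of Jiangsu Higher Education Institutions (09KJA520002).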
Keywords: Reinforcement learning; Tile Coding; Actor-Critic; Model learning; Function approximation
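
The abstract names Tile Coding as the state encoding but this record does not reproduce the algorithm itself. As a rough illustration only, the sketch below shows one common way to tile-code a continuous state and use the active tiles in a linear value estimate, such as a critic might maintain. It is not the authors' implementation: the function names (`active_tiles`, `value`, `td_update`), the uniform-offset scheme, and all parameter values are assumptions made for this example.

```python
import math


def active_tiles(state, lows, highs, tiles_per_dim, num_tilings):
    """Return one active tile index per tiling for a continuous state.

    Assumed scheme: every tiling is a uniform grid, and tiling t is shifted
    by t / num_tilings of a tile width along each dimension.
    """
    side = tiles_per_dim + 1  # one extra tile per dimension absorbs the shift
    tiles_per_tiling = side ** len(state)
    indices = []
    for t in range(num_tilings):
        offset = t / num_tilings
        index = 0
        for x, lo, hi in zip(state, lows, highs):
            scaled = (x - lo) / (hi - lo) * tiles_per_dim  # position in tile units
            coord = int(math.floor(scaled + offset))
            coord = min(max(coord, 0), tiles_per_dim)  # clamp out-of-range states
            index = index * side + coord  # row-major index within this tiling
        indices.append(t * tiles_per_tiling + index)
    return indices


def value(weights, tiles):
    """Linear value estimate: the sum of the weights of the active tiles."""
    return sum(weights[i] for i in tiles)


def td_update(weights, tiles, target, alpha):
    """Move the estimate toward `target`, splitting the step across tilings."""
    error = target - value(weights, tiles)
    step = alpha / len(tiles)
    for i in tiles:
        weights[i] += step * error
    return error


# Hypothetical 2-D state space [0, 1] x [0, 1] with 2 tilings of 4 x 4 tiles.
lows, highs = (0.0, 0.0), (1.0, 1.0)
tiles_per_dim, num_tilings = 4, 2
weights = [0.0] * (num_tilings * (tiles_per_dim + 1) ** 2)

tiles = active_tiles((0.5, 0.5), lows, highs, tiles_per_dim, num_tilings)
td_update(weights, tiles, target=1.0, alpha=0.5)
```

Because each state activates exactly one tile per tiling, evaluating or updating the estimate costs time proportional to the number of tilings rather than to the size of the weight vector, which is the low computational cost the abstract refers to. Nearby states share tiles and therefore generalize to one another, while distant states do not.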