A Game Algorithm Integrating an Environment Model with Deep Reinforcement Learning

Published English title: The game's algorithm based on integrated environment model and intensive learning
Abstract: Game algorithms based on deep reinforcement learning train their agents slowly. To address this, a new game algorithm is proposed that integrates an environment model with deep reinforcement learning. The algorithm takes into account the importance of the experience the agent gained from previously executing related tasks, and converts that task experience into an environment model. The agent first learns from the environment model, and only after that does it interact with the real environment. This lets the agent make fewer mistakes and speeds up its learning. To verify the algorithm's effectiveness, simulation experiments comparing two algorithms were run on two games selected from the ALE game platform. The results show that the new algorithm effectively increases the agent's training speed on both selected games.
Authors: HUANG Xueyu (黄学雨), GUO Qin (郭勤) (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)
Source: Journal of Jiangxi University of Science and Technology (《江西理工大学学报》), CAS, 2018, Issue 3, pp. 84-89 (6 pages)
Funding: Jiangxi Provincial Graduate Innovation Special Fund Project (YC2016-S317)
Keywords: deep reinforcement learning; environment model; ALE game platform
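
The two-phase idea in the abstract (learn from an environment model built from earlier task experience, then interact with the real environment) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper works with deep reinforcement learning on ALE games, whereas the sketch below swaps in tabular Q-learning on a hypothetical six-cell chain task so that it stays self-contained. The names `ChainEnv`, `collect_experience`, `build_model`, `pretrain_on_model`, and `run_episode` are illustrative, and the lookup-table model is only an assumption about what "converting task experience into an environment model" could look like in the simplest case.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0 = step left, 1 = step right


class ChainEnv:
    """Hypothetical toy task: start at cell 0, reach the last cell of a chain."""

    def __init__(self, length=6):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        return self.state, (1.0 if done else 0.0), done


def collect_experience(env, rng, episodes=30, max_steps=1000):
    """Log (s, a, r, s2, done) tuples from a random policy: the 'prior experience'."""
    log = []
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            a = rng.choice(ACTIONS)
            s2, r, done = env.step(a)
            log.append((s, a, r, s2, done))
            s = s2
            if done:
                break
    return log


def build_model(log):
    """Assumed environment model: a lookup table (s, a) -> (r, s2, done)."""
    return {(s, a): (r, s2, done) for s, a, r, s2, done in log}


def q_update(q, s, a, r, s2, done, alpha=0.5, gamma=0.9):
    """One tabular Q-learning backup."""
    best_next = 0.0 if done else max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])


def pretrain_on_model(q, model, passes=50):
    """Phase 1: learn from the model only, with no real-environment steps."""
    for _ in range(passes):
        for (s, a), (r, s2, done) in sorted(model.items()):
            q_update(q, s, a, r, s2, done)


def run_episode(env, q, rng, epsilon=0.1, max_steps=1000):
    """Phase 2: epsilon-greedy interaction with the real environment."""
    s = env.reset()
    for t in range(1, max_steps + 1):
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)
        else:
            # Greedy action, breaking ties at random.
            best = max(q[(s, b)] for b in ACTIONS)
            a = rng.choice([b for b in ACTIONS if q[(s, b)] == best])
        s2, r, done = env.step(a)
        q_update(q, s, a, r, s2, done)
        s = s2
        if done:
            return t
    return max_steps


rng = random.Random(0)
env = ChainEnv()
model = build_model(collect_experience(env, rng))

q_pre = defaultdict(float)      # agent that first learns from the model
pretrain_on_model(q_pre, model)
q_scratch = defaultdict(float)  # agent that starts in the real environment

steps_pre = run_episode(env, q_pre, rng, epsilon=0.0)
steps_scratch = run_episode(env, q_scratch, rng, epsilon=0.0)
print("first-episode steps, pretrained:", steps_pre)
print("first-episode steps, from scratch:", steps_scratch)
```

In this toy setting the pretrained agent already prefers the action that leads toward the goal when it first meets the real environment, while the untrained agent has to wander until it sees its first reward. That is the kind of early-training saving the abstract describes, although results on ALE games with a deep network would depend on how accurate the learned model is.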