Research on ICRA RoboMaster AI Strategy Based on Reinforcement Learning
Abstract In this paper, reinforcement learning is applied to simulate robot motion control. A PyGame virtual environment is introduced, and a neural network is trained to model the Q function. A robot combat reinforcement learning model is proposed whose input is pixels (images) and whose output is a value function, and the model is trained in a predefined environment. The core model applies a neural network to deep Q-learning, using the DQN algorithm to simulate the decision-making process. In addition, the Actor-critic algorithm is applied during training and compared with DQN, and the differences between the outputs of the two models are discussed in depth. Monte Carlo tree search (MCTS) is used in place of the traditional alpha-beta search algorithm, and other symmetric aspects are changed in order to train strategies for the asymmetric case. Through research, comparison and analysis, the paper concludes that a general self-reinforcement learning method can indeed be found.
Authors Chen Mingyang (陈明阳); Liu Bo (刘博); Mao Yifeng (茆意风) (University of Pennsylvania, Pennsylvania 19019)
Affiliation University of Pennsylvania
Source China-Arab States Science and Technology Forum (《中阿科技论坛(中英文)》), 2020, No. 9, pp. 170-173 (4 pages)
Keywords ICRA RoboMaster competition; reinforcement learning; DQN; Actor-critic algorithm
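
The abstract compares DQN with an Actor-critic method but gives no formulas. As a rough illustration only, the sketch below shows the standard one-step learning signals the two families are usually built on: the Q-learning target for DQN and the temporal-difference error for a critic. The function names, the discount factor `gamma`, and the sample numbers are assumptions for illustration and are not taken from the paper, whose network architecture, reward design and hyperparameters are not stated here.

```python
def dqn_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a').

    next_q_values holds the estimated Q(s', a') for every action a'.
    (Hypothetical helper; not the paper's code.)
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    return reward + gamma * max(next_q_values)


def td_error(reward, value, next_value, gamma=0.99, done=False):
    """One-step TD error for a critic: r + gamma * V(s') - V(s).

    In actor-critic methods this quantity is commonly used both to update
    the critic and to weight the actor's policy-gradient step.
    (Hypothetical helper; not the paper's code.)
    """
    bootstrap = 0.0 if done else gamma * next_value
    return reward + bootstrap - value


if __name__ == "__main__":
    # Made-up numbers, chosen only to show the arithmetic.
    print(dqn_target(1.0, [0.5, 2.0], gamma=0.5))
    print(td_error(1.0, 0.5, 2.0, gamma=0.5))
```

The difference this makes visible is that DQN bootstraps from the best action value in the next state, while a critic bootstraps from a state value and leaves action selection to a separate policy.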
