Research on ICRA RoboMaster AI Strategy Based on Reinforcement Learning

Abstract: This paper applies reinforcement learning to simulated robot motion control. A PyGame virtual environment is introduced, a neural network is trained to approximate the Q function, and a reinforcement learning model for robot combat is proposed. The model takes pixels (images) as input and outputs a value function, and it is trained in a predefined environment. The core model applies a neural network to deep Q-learning, using the DQN algorithm to simulate the decision-making process. An Actor-critic algorithm is also applied during training, and the differences between the outputs of the two models are compared and discussed in depth. In addition, the Monte Carlo tree search (MCTS) algorithm is used in place of the traditional Alpha-Beta search algorithm, and a strategy for the asymmetric case is trained by changing other symmetric aspects. Based on this comparison and analysis, the authors conclude that a general self-reinforcement learning method can indeed be found.
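
The abstract compares DQN with an Actor-critic method but does not give their update rules. As a rough illustration only, and not the authors' implementation, the sketch below computes the standard one-step DQN training target and the one-step TD error that is commonly used as the advantage signal in Actor-critic methods. The function names, the discount factor `gamma`, and the toy batch are assumptions made for this example.

```python
import numpy as np


def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """One-step DQN target: y = r + gamma * max_a' Q(s', a').

    next_q_values has shape (batch, n_actions). For terminal
    transitions (dones == True) the bootstrap term is dropped.
    All names here are hypothetical, chosen for illustration.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    next_q_values = np.asarray(next_q_values, dtype=np.float64)
    dones = np.asarray(dones, dtype=bool)
    max_next = next_q_values.max(axis=1)  # greedy value of the next state
    return rewards + gamma * max_next * (~dones)


def td_advantage(rewards, values, next_values, dones, gamma=0.99):
    """One-step TD error: delta = r + gamma * V(s') - V(s).

    In Actor-critic methods this quantity is often used both as the
    critic's regression error and as the actor's advantage estimate.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    next_values = np.asarray(next_values, dtype=np.float64)
    dones = np.asarray(dones, dtype=bool)
    return rewards + gamma * next_values * (~dones) - values


if __name__ == "__main__":
    # Toy batch of two transitions; the second one is terminal.
    y = dqn_targets([1.0, 0.0], [[0.5, 2.0], [3.0, 1.0]], [False, True], gamma=0.9)
    delta = td_advantage([1.0], [0.5], [2.0], [False], gamma=0.9)
    print(y, delta)
```

The main difference this highlights is that DQN bootstraps from the maximum action value of the next state, while the Actor-critic critic bootstraps from a state value and leaves action selection to a separate policy.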

Authors: Chen Mingyang (陈明阳); Liu Bo (刘博); Mao Yifeng (茆意风) (University of Pennsylvania, Pennsylvania 19019)

Affiliation: University of Pennsylvania

Source: China-Arab States Science and Technology Forum (《中阿科技论坛（中英文）》), 2020, No. 9, pp. 170-173 (4 pages)

Keywords: ICRA RoboMaster competition; reinforcement learning; DQN; Actor-critic algorithm

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
