
State attention in deep reinforcement learning (cited by: 11)
Abstract Since deep learning was combined with reinforcement learning, artificial intelligence has surpassed human-level performance in board games and video games. The real-time strategy game StarCraft, however, remains a demanding test platform for artificial intelligence researchers because of its enormous state and action spaces. The baseline agents that DeepMind trained on the StarCraft Ⅱ mini-games with the classical deep reinforcement learning algorithm A3C still play well below the level of ordinary amateur players. To address this gap, the paper adopts a simpler network structure and combines the attention mechanism with the rewards used in reinforcement learning, proposing an A3C algorithm based on state attention. Using fewer feature layers, the trained agent achieves the highest score in certain StarCraft Ⅱ mini-games, 71 points above DeepMind's baseline agent.
Authors SHEN Xiangxiang (申翔翔); HOU Xinwen (侯新文); YIN Chuanhuan (尹传环) (Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China; Center for Research on Intelligent System and Engineering, Institute of Automation, Chinese Academy of Sciences, Beijing 110016, China)
Source CAAI Transactions on Intelligent Systems (《智能系统学报》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 2, pp. 317-322 (6 pages)
Funding Fundamental Research Funds for the Central Universities (2018JBZ006); National Natural Science Foundation of China (61105056).
Keywords deep learning; reinforcement learning; attention mechanism; A3C algorithm; StarCraft Ⅱ mini-games; agent; micromanagement
