Abstract
With advances in machine learning, deep reinforcement learning has become a research hotspot because it can learn optimal policies from large-scale inputs through autonomous trial-and-error exploration. However, traditional reinforcement learning suffers from the curse of dimensionality on complex decision-making tasks and cannot handle sparse rewards. This paper proposes a hierarchical reinforcement learning algorithm that combines a Manager-Worker hierarchy with the classic reinforcement learning algorithm Deep Q-Network (DQN). The agent is trained in Atari game environments so that it learns an optimal policy through the "environment-action-feedback" loop. Experiments show that the method is more effective for complex decision-making in Atari games and exceeds the average level of human players.
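The abstract does not give implementation details, so the following is only a minimal sketch of how a Manager-Worker hierarchy can be combined with a Q-learning update. It makes several assumptions that are not from the paper: a tabular `QTable` stands in for the DQN network, a toy chain of states stands in for an Atari game, the Manager's action is a subgoal state, and the Worker receives a hypothetical `intrinsic_reward` only when it reaches that subgoal. All names and hyperparameter values are illustrative.

```python
import random
from collections import defaultdict

GAMMA = 0.99  # discount factor (assumed value, not from the paper)

def td_target(reward, next_q_values, done, gamma=GAMMA):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

def intrinsic_reward(state, subgoal):
    """Assumed design: the Worker is rewarded only for reaching the subgoal."""
    return 1.0 if state == subgoal else 0.0

class QTable:
    """Tabular stand-in for a DQN; a real agent would use a neural network."""
    def __init__(self, n_actions, lr=0.1):
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.n_actions = n_actions
        self.lr = lr

    def act(self, key, epsilon, rng):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            return rng.randrange(self.n_actions)
        values = self.q[key]
        return values.index(max(values))

    def update(self, key, action, reward, next_key, done):
        target = td_target(reward, self.q[next_key], done)
        self.q[key][action] += self.lr * (target - self.q[key][action])

def run_episode(manager, worker, rng, n_states=6, horizon=4,
                epsilon=0.2, max_options=50):
    """Toy chain 0..n_states-1 with a sparse extrinsic reward at the last state."""
    state, total, done = 0, 0.0, False
    for _ in range(max_options):
        start = state
        subgoal = manager.act(start, epsilon, rng)  # Manager picks a target state
        extrinsic = 0.0
        for _ in range(horizon):
            a = worker.act((state, subgoal), epsilon, rng)  # 0 = left, 1 = right
            nxt = max(0, min(n_states - 1, state + (1 if a == 1 else -1)))
            done = nxt == n_states - 1
            extrinsic += 1.0 if done else 0.0
            reached = nxt == subgoal
            # Worker learns from the intrinsic reward; its subtask ends on success.
            worker.update((state, subgoal), a, intrinsic_reward(nxt, subgoal),
                          (nxt, subgoal), done or reached)
            state = nxt
            if done or reached:
                break
        # Manager learns from the extrinsic reward collected under this subgoal
        # (simplified: one discount step per subgoal, regardless of its length).
        manager.update(start, subgoal, extrinsic, state, done)
        total += extrinsic
        if done:
            break
    return total
```

A run could look like `run_episode(QTable(6), QTable(2), random.Random(0))`, where the Manager has one action per candidate subgoal state and the Worker has two primitive actions. The point of the split is that the Worker gets dense feedback from subgoals even when the environment's own reward is sparse.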
Authors
周婉
姚溪子
肖雨薇
刘艳芳
ZHOU Wan; YAO Xizi; XIAO Yuwei; LIU Yanfang (Computer and Information Engineering College, Hubei University, Wuhan, Hubei 430000, China)
Source
《信息与电脑》
2022, No. 20, pp. 97-99 (3 pages)
Information & Computer
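Funding
Hubei Provincial College Students' Innovation and Entrepreneurship Training Program (Project No. 202110512065)
Hubei University College Students' Innovation and Entrepreneurship Training Program (Project No. 202110512086).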