Research on an Intention-Based Reinforcement Learning Method Using Mutual Information Maximization (cited by: 2)

Intention based reinforcement learning by information maximization
Abstract: Reinforcement learning studies how an agent learns to make good decisions by interacting with an unknown environment; its core is learning a policy. In conventional policy models, action selection depends mainly on state perception, historical memory, and model parameters, which makes the agent's behavior difficult to control. Humans, by contrast, usually choose their behavior according to their own intentions or motivations when carrying out a task. Inspired by this decision-making mechanism, and in order to make action selection controllable so that the agent chooses actions according to an intention, this paper incorporates an intention variable into the policy model and proposes an intention-controlled reinforcement learning method for policy learning. Specifically, the method maximizes the mutual information between the intention variable and the actions so that the two become highly correlated; the policy can then select actions related to a given intention variable, which provides control over the agent. Finally, experiments on complex simulated robot control tasks in Mujoco show that the proposed method can effectively control the robot's movement speed and movement angle through the intention variable.
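The quantity the abstract refers to can be illustrated with a toy calculation. The sketch below is not the paper's estimator: it assumes a discrete intention variable `c` with prior `p_c` and a tabular policy `policy[c, a] = pi(a | c)`, whereas the paper works with continuous robot control. The function name `mutual_information` and the example tables are hypothetical. It only shows that a policy whose actions depend on the intention has higher mutual information I(c; a) than one that ignores it.

```python
import numpy as np

def mutual_information(p_c, policy):
    """I(c; a) in nats for an intention prior p_c[c] and policy[c, a] = pi(a | c)."""
    p_c = np.asarray(p_c, dtype=float)
    policy = np.asarray(policy, dtype=float)
    joint = p_c[:, None] * policy          # joint distribution p(c, a)
    p_a = joint.sum(axis=0)                # action marginal p(a)
    indep = p_c[:, None] * p_a[None, :]    # product of marginals p(c) p(a)
    mask = joint > 0                       # treat 0 * log 0 as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))

# Two equally likely intentions, two actions (illustrative values only).
p_c = np.array([0.5, 0.5])
ignores_intention = np.array([[0.5, 0.5], [0.5, 0.5]])   # same action law for every c
follows_intention = np.array([[0.9, 0.1], [0.1, 0.9]])   # action mostly tracks c
deterministic = np.array([[1.0, 0.0], [0.0, 1.0]])       # action fully determined by c

mi_ignore = mutual_information(p_c, ignores_intention)
mi_follow = mutual_information(p_c, follows_intention)
mi_det = mutual_information(p_c, deterministic)
print(mi_ignore, mi_follow, mi_det)
```

In this toy setting the mutual information is zero when the policy ignores the intention and reaches its maximum, the entropy of the intention prior, when the action is fully determined by the intention. A training objective that rewards higher I(c; a) therefore pushes the policy toward the controllable behavior the abstract describes.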
Authors: Zhao Tingting (赵婷婷); Wu Shuai (吴帅); Yang Mengnan (杨梦楠); Chen Yarui (陈亚瑞); Wang Yuan (王嫄); Yang Jucheng (杨巨成) (College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457, China)
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 11, pp. 3327-3332, 3364 (7 pages in total)
Funding: National Natural Science Foundation of China (61976156); Tianjin Enterprise Science and Technology Commissioner Project (20YDTPJC00560).
Keywords: reinforcement learning (RL); mutual information; intentional control; proximal policy optimization