Method for inhomogeneous multi-task reinforcement learning based on morphological information encoding by graph embedding

Abstract: Traditional reinforcement learning methods suffer from low efficiency, poor generalization, and policy models that cannot be transferred. To address these problems, this paper proposes an inhomogeneous multi-task reinforcement learning method that improves efficiency and generalization by learning multiple reinforcement learning tasks together. The morphology of the agent is constructed as a graph. Because a graph neural network can process graphs of arbitrary connectivity and size, it is well suited to inhomogeneous tasks whose state and action spaces differ in dimension. This removes the limitation that a model cannot be transferred and makes full use of the inductive bias that graph neural networks naturally draw from graph structure. The resulting model trains efficiently, generalizes better, and can be transferred quickly to new tasks. Experimental results show that, compared with previous methods, the proposed method performs better in both multi-task learning and transfer learning experiments, and exhibits more accurate knowledge transfer in the transfer learning experiments. Introducing the structural bias of the agent graph gives the method higher efficiency and better transfer and generalization performance.
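The idea that one set of graph-network weights can serve agents with differently sized state and action spaces can be illustrated with a minimal sketch. The code below is not the paper's architecture (the abstract gives no layer details, and the variational graph autoencoder named in the keywords is not modelled here). It assumes each limb is a graph node with a fixed-size local observation, uses one round of mean-aggregation message passing, and emits one action per node. All names (`message_passing_policy`, `init_params`, the chain and star morphologies) are hypothetical.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def init_params(obs_dim, hidden_dim, seed=0):
    """Create weights shared by every node; sizes do not depend on the graph."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_self": rand_matrix(hidden_dim, obs_dim),  # transforms a node's own observation
        "W_nbr": rand_matrix(hidden_dim, obs_dim),   # transforms the mean neighbor observation
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)],  # per-node action head
    }


def message_passing_policy(node_obs, edges, params):
    """One message-passing round, then one bounded action per node (limb)."""
    n = len(node_obs)
    dim = len(node_obs[0])
    neighbors = [[] for _ in range(n)]
    for i, j in edges:  # edges are treated as undirected physical connections
        neighbors[i].append(j)
        neighbors[j].append(i)

    actions = []
    for i in range(n):
        if neighbors[i]:
            count = len(neighbors[i])
            mean_nbr = [sum(node_obs[j][k] for j in neighbors[i]) / count
                        for k in range(dim)]
        else:
            mean_nbr = [0.0] * dim  # isolated node: no incoming message
        own = matvec(params["W_self"], node_obs[i])
        msg = matvec(params["W_nbr"], mean_nbr)
        hidden = [math.tanh(a + b) for a, b in zip(own, msg)]
        actions.append(math.tanh(sum(w * h for w, h in zip(params["w_out"], hidden))))
    return actions


params = init_params(obs_dim=3, hidden_dim=4)

# Hypothetical 3-limb chain agent: 3 nodes, so a 3-dimensional action.
chain_obs = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.5], [0.4, 0.4, -0.2]]
chain_edges = [(0, 1), (1, 2)]

# Hypothetical torso with four legs: 5 nodes, so a 5-dimensional action.
star_obs = [[0.0, 0.0, 1.0]] + [[0.1 * k, -0.1 * k, 0.2] for k in range(1, 5)]
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]

chain_actions = message_passing_policy(chain_obs, chain_edges, params)
star_actions = message_passing_policy(star_obs, star_edges, params)
print(len(chain_actions), len(star_actions))
```

Because the weights act per node and per edge, the same `params` produce an action vector whose length follows the number of limbs. This is the property that lets a single policy be shared across tasks with different state and action dimensions.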

Authors: He Xiao; Wang Wenxue

Affiliations: State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; Institutes for Robotics & Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China; University of Chinese Academy of Sciences, Beijing 100049, China

Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 4, pp. 1022-1028 (7 pages)

Funding: National Natural Science Foundation of China (U1908215); Liaoning Province "Xingliao Talents Program" (XLYC2002014).

Keywords: multi-task reinforcement learning; graph neural network; variational graph autoencoder; morphology information encoding; transfer learning

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
