Research on multi-agent strategy based on filtering mechanism to filter information


Abstract: When agents in a multi-agent system cooperate or compete, the joint information space grows and agents extract information from one another less efficiently. To address this, the paper adopts a multi-agent reinforcement learning strategy method with a filtering mechanism for selecting information (FMAC), which strengthens information exchange between agents. The method identifies agents that are related to one another, computes each agent's information contribution from that correlation, and filters out information from irrelevant agents, enabling effective communication in cooperative, competitive, or mixed environments. Centralized training with decentralized execution is used to address the non-stationarity of the environment. Experiments against comparison algorithms show that the improved algorithm raises policy iteration efficiency and generalization ability and remains stable as the number of agents increases, which helps extend multi-agent reinforcement learning to a wider range of fields.
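The filtering step described in the abstract can be illustrated with a small sketch. The abstract does not say how correlation or information contribution is computed, so everything below is an assumption made for illustration: relevance is scored with a scaled dot product, contributions are softmax weights, and agents whose weight falls below a hypothetical `threshold` are dropped before the remaining messages are aggregated. The function name `filter_messages` and its parameters are not taken from the paper.

```python
import math


def filter_messages(query, keys, values, threshold=0.2):
    """Weight other agents' messages by relevance and drop the weak ones.

    query:     feature vector of the agent doing the filtering (assumed encoding)
    keys:      one feature vector per other agent, used to score relevance
    values:    one message vector per other agent
    threshold: minimum softmax weight an agent needs to be kept (assumed)
    Returns (weights, aggregated_message).
    """
    if not keys:
        return [], []

    # Relevance score: scaled dot product between this agent and each other agent.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]

    # Softmax turns scores into contribution weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Filtering: zero out agents whose contribution is below the threshold.
    kept = [w if w >= threshold else 0.0 for w in weights]
    kept_total = sum(kept)
    dim = len(values[0])
    if kept_total == 0.0:
        # No agent passed the filter, so the aggregated message is empty.
        return kept, [0.0] * dim

    # Renormalize the surviving weights and aggregate their messages.
    kept = [w / kept_total for w in kept]
    message = [sum(w * v[d] for w, v in zip(kept, values)) for d in range(dim)]
    return kept, message


weights, message = filter_messages(
    query=[1.0, 0.0],
    keys=[[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0]],
    values=[[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
)
```

In this toy call only the first agent is aligned with the query, so the other two fall below the threshold and contribute nothing to the aggregated message. A hard threshold is one of several possible filtering rules; keeping a fixed top-k of agents would be an equally plausible reading of the abstract.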

Authors: CHEN Liang; GUO Ting; LIU Yun-ting; YANG Jia-ming (School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang 110159, China)

Affiliation: School of Automation and Electrical Engineering, Shenyang Ligong University

Source: Control and Decision (《控制与决策》), 2022, Issue 6, pp. 1643-1648 (6 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

Keywords: reinforcement learning; multi-agent system; filtering mechanism; centralized training decentralized execution

Classification code: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]
