Learning algorithm of multi-agents system based on influence diagrams (基于影响图的多智能体学习算法)

Cited by: 1


Abstract: This paper proposes a learning algorithm for multi-agent systems. An influence diagram is used as the modeling and representation tool for each agent. Given an initial model of an agent and its behavior history, a new model is constructed by learning the agent's capabilities, beliefs, and preferences. The learning method takes the observed behavior history of other agents as the training set and uses a neural network, together with decision knowledge and expert knowledge, to modify the connections between the nodes of the influence diagram. Behavior that is inconsistent with the agent's history is interpreted as a random deviation of the underlying utility function; these deviations are simulated with a Markov chain Monte Carlo technique in order to adjust the utility function. An example of multi-aircraft formation cooperative air combat illustrates the practicality of the algorithm.
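The utility-adjustment step mentioned in the abstract can be illustrated with a small sketch. The record gives no formulas, so everything below is an assumption made for illustration, not the authors' method: the utility is taken to be linear in hypothetical action features, the "random deviation" is modeled as softmax choice noise, a Gaussian prior is centered on the initial model's weights, and a plain random-walk Metropolis sampler stands in for the Markov chain Monte Carlo technique. The names `log_likelihood` and `metropolis_utility` are invented here.

```python
import math
import random


def log_likelihood(weights, history, noise=1.0):
    """Log-probability of the observed choices under a softmax choice model.

    history is a list of (options, chosen) pairs: options is a list of
    feature vectors (one per available action), chosen is the index of the
    action the observed agent actually took. (Assumed data layout.)
    """
    total = 0.0
    for options, chosen in history:
        utils = [sum(w * f for w, f in zip(weights, feats)) / noise
                 for feats in options]
        m = max(utils)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(u - m) for u in utils))
        total += utils[chosen] - log_z
    return total


def metropolis_utility(history, init_weights, steps=5000, step_size=0.3,
                       prior_sd=2.0, seed=0):
    """Random-walk Metropolis over utility weights; returns the posterior mean.

    The Gaussian prior is centered on init_weights, i.e. the initial model.
    """
    rng = random.Random(seed)

    def log_post(w):
        prior = -sum((wi - w0) ** 2
                     for wi, w0 in zip(w, init_weights)) / (2 * prior_sd ** 2)
        return prior + log_likelihood(w, history)

    current = list(init_weights)
    current_lp = log_post(current)
    samples = []
    for _ in range(steps):
        proposal = [wi + rng.gauss(0.0, step_size) for wi in current]
        prop_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, prop_lp - current_lp)):
            current, current_lp = proposal, prop_lp
        samples.append(list(current))
    kept = samples[steps // 5:]  # discard the first 20% as burn-in
    return [sum(s[i] for s in kept) / len(kept)
            for i in range(len(init_weights))]


if __name__ == "__main__":
    # The initial model prefers feature 1, but the observed agent always
    # picks the action that scores on feature 0.
    history = [([[1.0, 0.0], [0.0, 1.0]], 0)] * 30
    print(metropolis_utility(history, [0.0, 1.0]))
```

In this toy setup the sampled weights drift away from the initial model toward values that explain the observed choices, which is the general idea of treating inconsistent behavior as evidence about the utility function rather than as an error.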

Authors: Zhong Lin (钟麟), Chen Lijuan (陈丽娟), Tong Ming'an (佟明安), Zhang Shengyun (张圣云)

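Affiliations: School of Electronics and Information, Northwestern Polytechnical University; Affiliated Middle School of the Second Artillery Engineering College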

Source: Journal of Systems Engineering (《系统工程学报》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 3, pp. 377-380 (4 pages)

Funding: Air Force national defense pre-research project (4020501)

Keywords: multi-agent systems; influence diagram; learning; neural network

Classification: TP273.22 [Automation and computer technology: detection technology and automation devices]



