Research on a Multi-Agent Reinforcement Learning Based Dynamic Coordination Mechanism for Wartime Spares Support (cited by: 2)
Abstract: Effective spare parts support is an important factor in keeping aviation equipment in good condition, and it plays an important role in wartime. Wartime spare parts support is marked by pronounced uncertainty. To meet the requirements of Precision Support, it must be planned deliberately before the war and executed flexibly, with greater emphasis on the dynamic coordination of spare resources within the system. Given the similarity between the wartime spares support system and a multi-agent system, agent-based modeling and simulation is adopted to study the dynamic coordination mechanism in the multi-stage supply support process. Groups in the structure of the multi-agent system model are defined on the basis of the supply-demand relationships between agents. To obtain the spare allocation strategy that maximizes military benefit when spares are short, a multi-agent reinforcement learning method that takes the group as its unit is designed. Finally, a simulation example shows that the method is effective.
Source: Journal of Air Force Engineering University (Natural Science Edition), indexed in CSCD and the Peking University Core Journals list, 2009, No. 3, pp. 59-63 (5 pages)
Keywords: wartime spares support system; dynamic coordination mechanism; multi-agent system; reinforcement learning
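
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes: one group, consisting of a supplier agent and the demander agents it serves, learns by tabular Q-learning how to allocate scarce spares so that total military benefit is maximized. Every name here (`step`, `train_group_allocator`, `greedy_allocation`), the benefit weights, and the state encoding are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import defaultdict


def step(state, action, benefit):
    """Send one spare to demander `action`; return (next_state, reward).

    state = (spares left in the group, tuple of unmet demand per demander).
    The reward is the demander's benefit weight if it still needed a spare,
    otherwise 0 (the spare is wasted). This model is an assumption.
    """
    stock, unmet = state
    unmet = list(unmet)
    reward = 0.0
    if unmet[action] > 0:
        unmet[action] -= 1
        reward = benefit[action]
    return (stock - 1, tuple(unmet)), reward


def train_group_allocator(benefit, demand, stock, episodes=2000,
                          alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning over allocation decisions inside one group."""
    rng = random.Random(seed)
    n = len(benefit)
    q = defaultdict(lambda: [0.0] * n)
    for _ in range(episodes):
        state = (stock, tuple(demand))
        while state[0] > 0:
            # Epsilon-greedy choice of which demander receives the next spare.
            if rng.random() < epsilon:
                action = rng.randrange(n)
            else:
                action = max(range(n), key=lambda a: q[state][a])
            next_state, reward = step(state, action, benefit)
            # An episode ends when the group has no spares left.
            if next_state[0] > 0:
                target = reward + gamma * max(q[next_state])
            else:
                target = reward
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_allocation(q, benefit, demand, stock):
    """Follow the learned Q-table greedily; return (plan, total benefit)."""
    state = (stock, tuple(demand))
    plan, total = [], 0.0
    while state[0] > 0:
        action = max(range(len(benefit)), key=lambda a: q[state][a])
        state, reward = step(state, action, benefit)
        plan.append(action)
        total += reward
    return plan, total


if __name__ == "__main__":
    benefit = [1.0, 3.0]   # hypothetical military-benefit weights
    demand = [2, 2]        # each demander needs two spares
    stock = 2              # shortage: only two spares available
    q = train_group_allocator(benefit, demand, stock)
    print(greedy_allocation(q, benefit, demand, stock))
```

In this toy setting the shortage forces a choice between the two demanders, and the learned policy should favor the one with the larger benefit weight. A multi-stage, multi-group version as described in the abstract would need a richer state and coordination between groups, which this sketch does not attempt.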