An Agent Model Based on Intention Tracking and Reinforcement Learning (cited by: 3)

Intention Tracking Based Reinforcement Learning Agent Model
Abstract: To address the poor proactiveness of agent behavior in dynamic, adversarial multi-agent system (MAS) environments, an agent model that combines intention tracking with reinforcement learning is proposed. The model handles opponent information and environment information separately: a Q-learning mechanism is introduced into the agent's BDI mental model to cope with environmental change, while, on top of reinforcement learning, the model emphasizes tracking the intentions of opponents and opponent teams. Tambe's intention tracking theory is improved, and opponent models are built from opponent behavior in a specific adversarial environment in order to track the intentions of individual opponents and opponent teams, predict opponent goals, and adjust the agent's own behavior accordingly. Experiments show that the proposed agent model is more autonomous and adaptive, and has stronger survivability in dynamic adversarial systems.
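Authors: Xu Shuang (续爽), Jia Yunde (贾云得)
Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》); indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2004, No. 8, pp. 679-682 (4 pages)
Funding: National "863" Program of China (2002AA735051)
Keywords: multi-agent system; intention tracking; Q-learning; BDI model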
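
The abstract describes keeping opponent information separate from environment information and using Q-learning to respond to environmental change. The sketch below is one rough way to picture that split. It is not the authors' model: the `IntentionTracker` class (a frequency count over recent opponent actions), the `QAgent` class, and all parameter values are hypothetical and chosen only for illustration. Tambe-style intention tracking is considerably richer than this stand-in.

```python
import random
from collections import Counter, defaultdict


class IntentionTracker:
    """Hypothetical opponent model (an assumption, not the paper's method):
    guesses the opponent's intention as its most frequent recent action."""

    def __init__(self, window=5):
        self.window = window
        self.history = []

    def observe(self, opponent_action):
        # Keep only the most recent `window` observations.
        self.history.append(opponent_action)
        self.history = self.history[-self.window:]

    def predict(self):
        if not self.history:
            return None
        return Counter(self.history).most_common(1)[0][0]


class QAgent:
    """Tabular Q-learning over states of the form
    (environment_state, predicted_opponent_intention)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, action) -> value, default 0.0
        self.rng = random.Random(seed)

    def act(self, state):
        # Epsilon-greedy action selection.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


# Usage: the tracker's prediction is folded into the learner's state.
tracker = IntentionTracker(window=3)
for seen in ["attack", "attack", "retreat"]:
    tracker.observe(seen)

agent = QAgent(["advance", "evade"], alpha=0.5, epsilon=0.0)
state = ("near", tracker.predict())
next_state = ("far", tracker.predict())
agent.update(state, "evade", reward=1.0, next_state=next_state)
```

Folding the predicted intention into the state is only one possible design. It keeps the learner unchanged while letting opponent predictions influence which action values are consulted.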

References (9)

  • [1] Mitchell T M. Machine learning [M]. New York: The McGraw-Hill Companies Inc., 1997.
  • [2] Schut M C, Wooldridge M. Intention reconsideration in complex environments [A]. Proceedings of the Fourth International Conference on Autonomous Agents [C]. Barcelona: [s. n.], 2000. 209-216.
  • [3] Tessier C, Chaudron L. Conflicting agents: conflict management in multi-agent systems [M]. Dordrecht: Kluwer Academic Publishers, 2001.
  • [4] Sutton R S, Barto A G. Reinforcement learning [M]. Cambridge: MIT Press, 1998.
  • [5] Tambe M. Tracking dynamic team activity [EB/OL]. http://teamcore.usc.edu/papers/96/AT/aaai96team-final.ps, 1996-08-06/2003-07-10.
  • [6] Tambe M. Building agent teams using an explicit teamwork model and learning [J]. Artificial Intelligence, 1999, 110: 215-240.
  • [8] Schut M C, Wooldridge M. Principles of intention reconsideration [Z]. The Fifth International Conference on Autonomous Agents, Montreal, 2001.
  • [9] Jung H, Tambe M. Conflicts in agent teams [M]. Dordrecht: Kluwer Academic Publishers, 2000.
  • [10] Pynadath D, Scerri P, Tambe M. MDPs for adjustable autonomy in a real-world multi-agent environment [EB/OL]. http://teamcore.usc.edu/papers/2001/springSympol.ps, 2001-06-06/2003-07-10.

Co-cited literature (25)

  • 1. Li Rui. Research on the main algorithms of reinforcement learning [J]. Journal of Western Chongqing University (Natural Science Edition), 2004, 3(3): 22-25. Cited by: 1
  • 2. Liu Xinyu, Hong Bingrong. Research on a BDI-framework-based multi-agent dynamic cooperation model and its application [J]. Journal of Computer Research and Development, 2002, 39(7): 797-801. Cited by: 4
  • 3. EINSTein: An Artificial-Life Laboratory for Exploring Self-Organized Emergence in Land Combat [R]. Ilachinski, A. CNA Research Memorandum CRM D239, 2000.
  • 4. Exploring Self-Organized Emergence in an Agent-Based Synthetic Warfare Lab. Dr. Andy Ilachinski [EB/OL]. http://www.cna.org.
  • 5. Towards a Science of Experimental Complexity: An Artificial-Life Approach to Modeling Warfare. Andy Ilachinski [EB/OL]. http://www.cna.org.
  • 6. Irreducible Semi-Autonomous Adaptive Combat (ISAAC): An Artificial-Life Approach to Land Warfare [R]. Ilachinski, A. Center for Naval Analyses Research Memorandum CRM, 1997, 97-61.
  • 7. Operational Synthesis Applied to Mutual NZAJS Questions Part Ⅰ, Marine Corp Combat Development Command [Z].
  • 8. Enhanced ISAAC Neural Simulation Toolkit (EINSTein), User's Guide [R]. Ilachinski, A, CNA, CIM 610.10, 1999.
  • 9. The Science of Complexity for Military Operations Research, W. O. Hedgepeth [J]. Phalanx, 26(1): 1993.
  • 10. Xiao Degui, Peng Lixiang, et al. An improved GPSR routing algorithm in hybrid VANET environments [J]. Journal of Software, 2012, 23(1): 100-107.

Citing literature (3)

Second-level citing literature (15)
