
Autonomous robot path planning agent based on hierarchical reinforcement learning (cited by: 5)
Abstract: Hierarchical reinforcement learning is an effective approach to agent decision making in complex systems with very large state spaces. By introducing ideas from heuristic algorithms, an existing hierarchical reinforcement learning method is improved so that the agent incorporates historical information during learning, which raises learning efficiency and addresses the problem of learning an optimal behavior policy in a huge state space under a dynamically changing environment. On the basis of an extended Belief-Desire-Intention (BDI) mental model, an architecture for an autonomous robot path planning agent with initiative, autonomy, reactivity, and sociability is proposed. Simulation experiments show that the path planning agent is feasible and effective.
Source: Computer Integrated Manufacturing Systems (《计算机集成制造系统》), 2009, No. 6: 1215-1221 (7 pages). Indexed in EI, CSCD, and the Peking University core journal list.
Funding: National Basic Research Program of China (973 Program), grant No. 2007CB714701.
Keywords: agent; reinforcement learning; consciousness model; path planning
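
As a rough illustration of the kind of learner the abstract describes, the sketch below shows plain tabular Q-learning on a small grid world whose action selection is biased by a goal-distance heuristic (in the spirit of heuristically accelerated Q-learning). It is not the paper's algorithm: the hierarchical decomposition and the reuse of historical information are not reproduced, and the grid layout, reward values, and the heuristic weight XI are invented for this example.

import random

# Minimal sketch (not the paper's method): tabular Q-learning for grid-world
# path planning, with action selection biased by a goal-distance heuristic.
# Grid size, obstacles, rewards, and the weight XI are made up for the example.

GRID_W, GRID_H = 10, 10
START, GOAL = (0, 0), (9, 9)
OBSTACLES = {(3, 3), (3, 4), (3, 5), (6, 6), (6, 7)}
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]        # N, S, E, W moves
ALPHA, GAMMA, EPSILON, XI = 0.1, 0.95, 0.1, 1.0     # XI weights the heuristic

Q = {}  # (state, action) -> estimated value


def heuristic(state, action):
    """Negative Manhattan distance to the goal after taking `action`."""
    nx, ny = state[0] + action[0], state[1] + action[1]
    return -(abs(GOAL[0] - nx) + abs(GOAL[1] - ny))


def step(state, action):
    """One environment transition: returns (next_state, reward, done)."""
    nx, ny = state[0] + action[0], state[1] + action[1]
    if not (0 <= nx < GRID_W and 0 <= ny < GRID_H) or (nx, ny) in OBSTACLES:
        return state, -1.0, False      # bump into wall/obstacle: stay, penalty
    if (nx, ny) == GOAL:
        return (nx, ny), 10.0, True    # goal reached
    return (nx, ny), -0.1, False       # ordinary move has a small cost


def choose_action(state):
    """Epsilon-greedy over Q plus the heuristic bias term."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q.get((state, a), 0.0) + XI * heuristic(state, a))


def train(episodes=500, max_steps=400):
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            action = choose_action(state)
            nxt, reward, done = step(state, action)
            best_next = max(Q.get((nxt, a), 0.0) for a in ACTIONS)
            old = Q.get((state, action), 0.0)
            Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)
            state = nxt
            if done:
                break


if __name__ == "__main__":
    train()
    # Greedy rollout of the learned policy from the start state.
    state, path = START, [START]
    for _ in range(100):
        if state == GOAL:
            break
        best = max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))
        state, _, _ = step(state, best)
        path.append(state)
    print(path)

A fuller reproduction would replace the flat Q table with a task hierarchy (for example, MAXQ-style subtasks) and carry learned information across episodes, as the paper describes.
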

References (13)

  • 1 DU Chunxia, GAO Yun, ZHANG Wen. Q-learning algorithm with prior knowledge in multi-agent systems [J]. Journal of Tsinghua University (Science and Technology), 2005, 45(7): 981-984.
  • 2 SUTTON R S, BARTO A G. Reinforcement learning: an introduction [M]. Cambridge, MA, USA: MIT Press, 1998.
  • 3 BRATMAN M E. Intentions, plans, and practical reason [M]. Cambridge, MA, USA: Harvard University Press, 1987.
  • 4 CHEN Weidong, XI Yugeng, GU Donglei. Advances in reinforcement learning for autonomous robots [J]. Robot, 2001, 23(4): 379-384.
  • 5 DIETTERICH T. The MAXQ method for hierarchical reinforcement learning [C]//Proceedings of the 15th International Conference on Machine Learning (ICML). San Francisco, CA, USA: Morgan Kaufmann, 1998: 118-126.
  • 6 KAPETANAKIS S, KUDENKO D. Reinforcement learning of coordination in cooperative multi-agent systems [C]//Proceedings of the 18th National Conference on Artificial Intelligence. Edmonton, Alberta, Canada: AAAI Press, 2002: 326-331.
  • 7 HU Shanli, SHI Chunyi. An intention maintenance model for rational agents [J]. Journal of Computer Research and Development, 2001, 38(9): 1046-1050.
  • 8 TESSIER C, CHAUDRON L. Conflicting agents: conflict management in multi-agent systems [M]. Dordrecht, Netherlands: Kluwer Academic Publishers, 2001.
  • 9 HU Shanli, SHI Chunyi. A model of intention for agents [J]. Journal of Software, 2000, 11(7): 965-970.
  • 10 RAO A S, GEORGEFF M P. BDI agents: from theory to practice [C]//Proceedings of the 1st International Conference on Multi-Agent Systems. New York, NY, USA: ACM Press, 1995: 312-319.


Co-cited references (63)

  • 1 SHEN Jing, GU Guochang, LIU Haibo. Mobile robot path planning based on hierarchical reinforcement learning in unknown dynamic environments [J]. Robot, 2006, 28(5): 544-547.
  • 2 LaValle S M. Planning algorithms [M]. 2nd ed. New York, NY, USA: Cambridge University Press, 2006.
  • 3 Tisdale J, Kim Z, Hedrick J. Autonomous UAV path planning and estimation [J]. IEEE Robotics and Automation Magazine, 2009, 16(2): 35-42.
  • 4 Fahimi F. Autonomous robots: modeling, path planning, and control [M]. Boston, MA, USA: Springer Science+Business Media, 2009.
  • 5 Kuwata Y, How J. Three dimensional receding horizon control for UAVs [C]//AIAA Guidance, Navigation, and Control Conference. Reston, VA, USA: AIAA, 2004: 2100-2113.
  • 6 Earl M G, D'Andrea R. Iterative MILP methods for vehicle control problems [J]. IEEE Transactions on Robotics, 2005, 21(6): 1158-1167.
  • 7 Chen Y, Han J D. LP-based path planning for target pursuit and obstacle avoidance in 3D relative coordinates [C]//American Control Conference. Piscataway, NJ, USA: IEEE, 2010: 5394-5399.
  • 8 Goerzen C, Kong Z, Mettler B. A survey of motion planning algorithms from the perspective of autonomous UAV guidance [J]. Journal of Intelligent and Robotic Systems, 2010, 57(1-4): 65-100.
  • 9 Vasudevan C, Ganesan K. Case-based path planning for autonomous underwater vehicles [J]. Autonomous Robots, 1996, 3(2/3): 79-89.
  • 10 Kruusmaa M. Global level path planning for mobile robots in dynamic environments [J]. Journal of Intelligent and Robotic Systems, 2003, 38(1): 55-83.

