
DP-Q(λ): Real-time Path Planning for Multi-agent in Large-scale Web3D Scene (Cited by: 4)

Abstract: Visualized multi-agent path planning in large-scale unknown scenes requires an efficient, stable algorithm that resolves collisions among agents and runs in real time in Web3D. To address these problems, a dynamic-probability single-chain convergent backtracking algorithm, DP-Q(λ), is proposed. It applies direction-heuristic constraints and a high-reward/heavy-penalty training method: on a single agent, a probability p (a random number in [0,1]) modulates the reward and penalty values and thereby determines the next path-planning step, while the agent senses whether the next position is free, achieving collision avoidance while walking. The single-agent scheme is then extended to multi-agent path planning, and the full scheme is implemented in Web3D. Experimental results show that the multi-agent real-time path planning realized by this algorithm meets the efficiency and stability requirements of autonomous learning in Web3D.
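This record gives no pseudocode, so the following is a minimal illustrative sketch (in Python) of the mechanism the abstract describes: a tabular Q(λ) learner with eligibility traces, in which a fresh random probability p in [0,1] scales the high-reward/heavy-penalty signal, and candidate moves are restricted to free neighboring cells so an agent only steps into an unoccupied position. All names (DPQLambdaAgent, free, etc.) and the concrete reward magnitudes (±100) are assumptions for illustration, not details taken from the paper.

import random
from collections import defaultdict

# Four grid moves; a direction heuristic could reorder or prune these.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

class DPQLambdaAgent:
    def __init__(self, alpha=0.1, gamma=0.9, lam=0.8, epsilon=0.1):
        self.Q = defaultdict(float)   # Q[(state, action)] value table
        self.E = defaultdict(float)   # eligibility traces
        self.alpha, self.gamma = alpha, gamma
        self.lam, self.epsilon = lam, epsilon

    def choose(self, state, free):
        """Epsilon-greedy choice among moves whose target cell is free."""
        candidates = [a for a in ACTIONS
                      if free((state[0] + a[0], state[1] + a[1]))]
        if not candidates:
            return None               # blocked on all sides: wait in place
        if random.random() < self.epsilon:
            return random.choice(candidates)
        return max(candidates, key=lambda a: self.Q[(state, a)])

    def update(self, s, a, s2, reached_goal, hit_obstacle):
        # Dynamic reward: a fresh probability p scales the high reward or
        # heavy penalty, as the abstract sketches.
        p = random.random()
        if reached_goal:
            r = 100.0 * p             # high reward (magnitude assumed)
        elif hit_obstacle:
            r = -100.0 * p            # heavy penalty (magnitude assumed)
        else:
            r = -1.0                  # small per-step cost
        best_next = max(ACTIONS, key=lambda act: self.Q[(s2, act)])
        delta = r + self.gamma * self.Q[(s2, best_next)] - self.Q[(s, a)]
        self.E[(s, a)] += 1.0         # accumulate the trace for (s, a)
        for key in list(self.E):      # propagate the update along the trace
            self.Q[key] += self.alpha * delta * self.E[key]
            self.E[key] *= self.gamma * self.lam

In a multi-agent setting, each agent can hold its own table while a shared free() predicate also reports cells currently occupied by other agents, which is one way to realize the "next position is free" collision check described in the abstract.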
Authors: Yan Fengting (闫丰亭); Jia Jinyuan (贾金原) (School of Software Engineering, Tongji University, Shanghai 201804, China)
Affiliation: Tongji University
Source: Journal of System Simulation (《系统仿真学报》), indexed in CAS, CSCD, and the Peking University Core Journal list, 2019, No. 1, pp. 16-26 (11 pages)
Funding: National Natural Science Foundation of China, General Program (61272270)
Keywords: Web3D; large-scale unknown environment; multi-agent; reinforcement learning; dynamic reward p; path planning
