
Dynamic Obstacle Avoidance for Service Robots Based on Spatio-Temporal Graph Attention Network
Abstract: Service robots navigating dense crowds of pedestrians who make their own decisions are prone to collisions, freezing, and unnatural paths. To address these problems, this study proposes a dynamic obstacle avoidance algorithm for service robots based on a spatio-temporal graph attention network within a Deep Reinforcement Learning (DRL) framework. The spatio-temporal graph attention network serves as the decision function of the Proximal Policy Optimization (PPO) algorithm. First, a Gated Recurrent Unit (GRU) controls how much of the environment the robot remembers and forgets and extracts the temporal features of the environment, which gives the robot some ability to predict pedestrian movement trends. Second, a graph attention network captures the implicit spatial interaction features between the robot and pedestrians, enabling the robot to find collision-free paths. Finally, the spatio-temporal graph attention network is trained with PPO so that the robot completes collision-free navigation tasks in a crowd. The algorithm is validated by simulation experiments in a dynamic closed environment with 2.5 m² per person. Compared with the non-learning Dynamic Window Approach (DWA), the proposed algorithm raises the navigation success rate by 71 percentage points. Compared with the learning-based DSRNN-RL algorithm, it raises the navigation success rate by 3 percentage points and produces shorter navigation paths. A real-time navigation test in the Gazebo environment shows an average inference time of 21.90 ms, which meets the requirements of real-time navigation.
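The three steps named in the abstract (GRU temporal encoding, graph attention over pedestrians, PPO training) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the hidden size, the 2-D observation, the random weights, the omission of the shared linear projection used in standard graph attention layers, and the clip range of 0.2 are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h, p):
    """One GRU update; the gates decide how much of the old state h is kept."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]  # update gate
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]  # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    # Convention used here: z weights the candidate (some libraries swap the roles).
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def attention_weights(robot, peds, a):
    """GAT-style weights: softmax over LeakyReLU(a . [robot || pedestrian])."""
    def leaky_relu(v, slope=0.2):
        return v if v > 0 else slope * v
    scores = [leaky_relu(sum(ai * fi for ai, fi in zip(a, robot + ped))) for ped in peds]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(peds, weights):
    """Attention-weighted sum of pedestrian features."""
    return [sum(w * ped[d] for w, ped in zip(weights, peds)) for d in range(len(peds[0]))]

def ppo_clip_term(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

def run_sketch(seed=0, hidden=4, steps=5, n_peds=3):
    rng = random.Random(seed)
    obs_dim = 2  # assumed observation: relative (x, y) position
    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    # W* matrices act on observations, U* matrices act on the hidden state.
    p = {k: rand_matrix(hidden, obs_dim if k[0] == "W" else hidden)
         for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
    a = [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden)]
    states = [[0.0] * hidden for _ in range(n_peds + 1)]  # index 0 is the robot
    for _ in range(steps):
        for i in range(n_peds + 1):
            obs = [rng.uniform(-1, 1) for _ in range(obs_dim)]  # random stand-in data
            states[i] = gru_step(obs, states[i], p)
    robot, peds = states[0], states[1:]
    weights = attention_weights(robot, peds, a)
    return weights, aggregate(peds, weights)

if __name__ == "__main__":
    weights, context = run_sketch()
    print("attention over pedestrians:", weights)
    print("aggregated spatial feature:", context)
```

In a full system the aggregated feature would feed policy and value heads, and the summed clipped terms would be maximized during training; those parts are left out here.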
Authors: DU Haijun; YU Su (School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2024, No. 2, pp. 105-112 (8 pages)
Funding: Scientific Research Program of the Science and Technology Commission of Shanghai Municipality (17511110204).
Keywords: service robot; dynamic obstacle avoidance; Deep Reinforcement Learning (DRL); spatio-temporal graph attention network; real-time navigation