Abstract
Mobile robots are important tools for tasks such as rescue and transportation, and enabling a robot system to adapt autonomously to different complex scenarios is an active research topic. For complex unknown environments containing static and dynamic obstacles, this paper builds a kinematic model of the mobile robot and proposes an obstacle avoidance algorithm that combines the Proximal Policy Optimization (PPO) algorithm with a Long Short-Term Memory (LSTM) network. Trained in simulation environments both with and without obstacles, the robot achieves autonomous obstacle avoidance in unstructured environments without prior map information. Simulation and experimental results show that the proposed algorithm effectively steers the robot around static and dynamic obstacles and outperforms the D3QN and PPO algorithms. It addresses the slow convergence of deep reinforcement learning algorithms when training robots to avoid obstacles.
Authors
WEN Zeteng (问泽藤)
WEN Shuhui (温淑慧)
ZHANG Di (张迪)
(Engineering Research Center of the Ministry of Education for Intelligent Control System and Intelligent Equipment, Yanshan University, Qinhuangdao, Hebei 066004, China; Key Laboratory of Industrial Computer Control Engineering of Hebei Province, Yanshan University, Qinhuangdao, Hebei 066004, China)
Source
Journal of Yanshan University (《燕山大学学报》)
CAS
Peking University Core Journals (北大核心)
2021, Issue 3, pp. 274-282 (9 pages)
Funding
Supported by the National Natural Science Foundation of China (61773333).
Keywords
mobile robot
deep reinforcement learning
autonomous obstacle avoidance
reward function
long short-term memory network