Abstract
An improved deep reinforcement learning algorithm based on depth image information is proposed to address the poor exploration ability and sparse rewards over the environment state space that traditional deep reinforcement learning exhibits in path planning for mobile robots in unknown indoor environments. Depth images acquired directly by a Kinect visual sensor, together with the target position, serve as the network input, and the robot's linear and angular velocities serve as the output action commands for the next step. An improved reward and punishment function is designed, which increases the reward value of the algorithm, optimizes the state space, and alleviates the reward-sparsity problem to a certain extent. Simulation results show that the improved algorithm strengthens the robot's exploration ability and optimizes the path trajectory, enabling the robot to avoid obstacles effectively and plan shorter paths: compared with the DQN algorithm, the average path length is shortened by 21.4% in a simple environment and by 11.3% in a complex environment.
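To make the described input-output structure concrete, the following is a minimal sketch of a policy network that maps a depth image plus a relative target position to (linear, angular) velocity commands, together with a distance-based shaped reward of the kind the abstract alludes to; the class and function names (DepthPolicyNet, shaped_reward), layer sizes, image resolution, and reward constants are illustrative assumptions, not the authors' exact design.

import torch
import torch.nn as nn


class DepthPolicyNet(nn.Module):
    """Maps a depth image and a relative target position to (linear, angular) velocity."""

    def __init__(self):
        super().__init__()
        # Convolutional encoder for the single-channel Kinect depth image
        # (assumed here to be normalized and resized to 80x80).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        # Fully connected head fuses image features with the 2-D target vector
        # (distance and relative heading to the goal in the robot frame).
        self.head = nn.Sequential(
            nn.Linear(64 * 18 * 18 + 2, 256), nn.ReLU(),
            nn.Linear(256, 2), nn.Tanh(),  # (linear, angular) velocity, scaled to [-1, 1]
        )

    def forward(self, depth, target):
        features = self.encoder(depth)
        return self.head(torch.cat([features, target], dim=1))


def shaped_reward(prev_dist, curr_dist, reached, collided,
                  goal_bonus=10.0, collision_penalty=-10.0, progress_gain=5.0):
    """Dense reward: terminal bonus/penalty plus a progress term proportional to the
    reduction in distance to the goal, which mitigates reward sparsity."""
    if reached:
        return goal_bonus
    if collided:
        return collision_penalty
    return progress_gain * (prev_dist - curr_dist)


# Example usage with dummy data.
net = DepthPolicyNet()
depth = torch.rand(1, 1, 80, 80)      # normalized depth image batch
target = torch.tensor([[1.5, 0.3]])   # [distance (m), relative heading (rad)] to goal
v, w = net(depth, target)[0]          # next-step velocity commands
r = shaped_reward(prev_dist=1.5, curr_dist=1.45, reached=False, collided=False)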
Authors
CHENG Yi; HAO Mimi (School of Control Science and Engineering, Tiangong University, Tianjin 300387, China)
Source
《计算机工程与应用》 (Computer Engineering and Applications)
Indexed in CSCD; Peking University Core Journals
2021, No. 21, pp. 256-262 (7 pages)
Funding
National Natural Science Foundation of China (61973234)
Natural Science Foundation of Tianjin (18JCYBJC88400, 18JCYBJC88300)
Innovative Research Team Training Program of Tianjin Higher Education Institutions (TD13-5036)
Keywords
path planning
depth image information
Kinect visual sensor
deep reinforcement learning
reward and punishment function
exploration ability