Abstract
A reinforcement-learning-based path planning algorithm for mobile robots is proposed. The algorithm discretizes the obstacle information around the robot, acquired by LIDAR (light detection and ranging), and the direction (bearing) of the target point into a finite number of states, which keeps the environment model and the size of the state space reasonable. A continuous reward function is designed so that every action taken by the robot receives a corresponding reward, which improves training efficiency. A simulation environment was built in Gazebo to train the agent, and the training results verify the effectiveness of the algorithm. A navigation experiment was also conducted on a physical robot, and the results show that the algorithm can complete the navigation task in a real environment as well.
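The two ideas in the abstract, a finite state built from lidar readings plus goal bearing and a reward available at every step, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the sector count, range thresholds, bearing bins, reward form, and the names `discretize_state` and `continuous_reward` are all assumptions made for illustration, since the abstract gives no concrete values.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
N_SECTORS = 5              # number of angular sectors the scan is split into
RANGE_EDGES = (0.5, 1.0)   # metres; boundaries between near / medium / far
N_BEARING_BINS = 8         # number of bins for the goal bearing


def discretize_state(ranges, goal_bearing):
    """Map a lidar scan and a goal bearing (radians) to a tuple of small ints.

    Assumes len(ranges) >= N_SECTORS; leftover beams beyond an even split
    are ignored in this sketch.
    """
    sector_size = len(ranges) // N_SECTORS
    state = []
    for i in range(N_SECTORS):
        sector = ranges[i * sector_size:(i + 1) * sector_size]
        nearest = min(sector)
        # Count how many thresholds the nearest obstacle clears:
        # 0 = near, 1 = medium, 2 = far.
        state.append(sum(nearest >= edge for edge in RANGE_EDGES))
    # Wrap the bearing into [0, 2*pi) and assign it to a bin.
    wrapped = goal_bearing % (2 * math.pi)
    bearing_bin = int(wrapped / (2 * math.pi) * N_BEARING_BINS) % N_BEARING_BINS
    state.append(bearing_bin)
    return tuple(state)


def continuous_reward(prev_dist, dist, min_range,
                      safe_range=1.0, k_goal=10.0, k_obs=1.0):
    """Per-step reward: progress toward the goal minus an obstacle penalty.

    The specific form and weights are hypothetical; the point is only that
    every action yields a graded reward, not just terminal events.
    """
    progress = k_goal * (prev_dist - dist)
    penalty = k_obs * max(0.0, safe_range - min_range)
    return progress - penalty
```

With these assumed settings the state space has 3^5 × 8 = 1944 entries, small enough for a tabular method, and the reward changes smoothly with distance to the goal and to the nearest obstacle.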
Authors
张福海
李宁
袁儒鹏
付宜利
Zhang Fuhai; Li Ning; Yuan Rupeng; Fu Yili (The State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin 150001, China)
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals
2018, Issue 12, pp. 65-70 (6 pages)
Funding
Supported by the Natural Science Foundation of Heilongjiang Province (LC2017022)
Keywords
mobile robot
reinforcement learning
path planning
continuous reward function
navigation experiment