Path planning method for citrus picking manipulator based on deep reinforcement learning (cited by: 5)

Abstract [Objective] To address the low training efficiency and low path planning success rate that arise when deep reinforcement learning (DRL) is used to plan paths for a picking manipulator in unstructured environments, this study proposed a path planning method for a citrus picking manipulator that combines DRL with an artificial potential field. [Method] First, the picking path planning problem was solved with reinforcement learning, and a reinforcement learning method incorporating an artificial potential field was designed. Second, a long short-term memory (LSTM) structure was introduced to improve the Actor and Critic networks of two DRL algorithms. Finally, the DRL algorithms were trained in three different unstructured citrus tree environments to plan paths for the picking manipulator. [Result] Simulation comparison experiments showed that combining reinforcement learning with the artificial potential field effectively improved the path planning success rate of the picking manipulator. Introducing the LSTM structure improved the convergence speed of the deep deterministic policy gradient (DDPG) algorithm by 57.25% and its path planning success rate by 23.00%, and improved the convergence speed of the soft actor-critic (SAC) algorithm by 53.73% and its path planning success rate by 9.00%. Compared with the traditional RRT-connect (rapidly exploring random trees connect) algorithm, the SAC algorithm with the LSTM structure shortened the planned path length by 16.20% and improved the path planning success rate by 9.67%. [Conclusion] The proposed method shows some advantages in planned path length and path planning success rate, and can serve as a reference for solving path planning problems of picking robots in unstructured environments.
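The abstract does not give the formulation used to combine the artificial potential field with reinforcement learning. As a rough illustration only, one common way to do this is to use the drop in a classical attractive-plus-repulsive potential as a shaped reward. The sketch below assumes that approach; every function name, gain, and threshold in it (`k_att`, `k_rep`, `rho0`, `goal_tol`, `bonus`) is hypothetical and not taken from the paper.

```python
import math

def attractive_potential(pos, goal, k_att=1.0):
    """Quadratic potential that grows with distance to the goal (assumed form)."""
    return 0.5 * k_att * math.dist(pos, goal) ** 2

def repulsive_potential(pos, obstacles, k_rep=1.0, rho0=0.2):
    """Sum of repulsive potentials; an obstacle only acts within radius rho0."""
    total = 0.0
    for obs in obstacles:
        rho = max(math.dist(pos, obs), 1e-6)  # avoid division by zero
        if rho < rho0:
            total += 0.5 * k_rep * (1.0 / rho - 1.0 / rho0) ** 2
    return total

def potential(pos, goal, obstacles):
    """Total potential at an end-effector position."""
    return attractive_potential(pos, goal) + repulsive_potential(pos, obstacles)

def shaped_reward(prev_pos, next_pos, goal, obstacles, goal_tol=0.02, bonus=10.0):
    """Reward a step by how much it lowers the total potential.

    A fixed bonus is added when the step ends within goal_tol of the goal.
    All constants are illustrative assumptions, not values from the paper.
    """
    reward = potential(prev_pos, goal, obstacles) - potential(next_pos, goal, obstacles)
    if math.dist(next_pos, goal) < goal_tol:
        reward += bonus
    return reward

if __name__ == "__main__":
    fruit = (1.0, 0.0, 0.0)       # hypothetical target fruit position
    branch = [(0.15, 0.0, 0.0)]   # hypothetical obstacle point on a branch
    print(shaped_reward((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), fruit, []))
    print(shaped_reward((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), fruit, branch))
```

In such a setup a DRL agent (for example DDPG or SAC, as named in the abstract) would receive `shaped_reward` at each step, so steps toward the fruit are rewarded and steps that approach an obstacle are penalized.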
Authors XIONG Chunyuan (熊春源); XIONG Juntao (熊俊涛); YANG Zhengang (杨振刚); HU Wenxin (胡文馨) (College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China)
Source Journal of South China Agricultural University (《华南农业大学学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2023, Issue 3, pp. 473-483 (11 pages)
Funding National Natural Science Foundation of China (32071912); Guangzhou Basic Research Program (202102080337).
Keywords Picking manipulator; Citrus; Path planning; Deep reinforcement learning; Unstructured environment; LSTM