Path Planning Method for Cleaning Robots Based on Tabular Memory
Abstract Deep reinforcement learning algorithms built on the value function decomposition framework converge slowly and use data inefficiently because Q values are estimated inaccurately during training. To address this, the paper proposes QMIX_TM, a value function decomposition algorithm with tabular memory that builds on the QMIX algorithm. A table that records high-value state-action pairs is introduced to guide the algorithm toward fast convergence to the optimal policy, and the data in the table are also stored in the experience replay buffer to raise the utilization of high-value data. To verify the algorithm's effectiveness, a simulation platform for cleaning robot path planning was built and a dedicated reward function was designed, and experiments on autonomous exploration of the optimal path were carried out on it. The experimental results show that QMIX_TM converges faster, and that the time the cleaning robot needs to complete its task is shortened by a factor of 12.9 compared with the original QMIX algorithm.
Authors ZHOU Weiqing; WANG Fei; CUI Dan; LI Chen (College of Automation, Qingdao University, Qingdao 266071, China; Shandong Provincial Key Laboratory of Industrial Control, Qingdao 266071, China; Shandong Weifang Tobacco Co., Ltd., Weifang 262400, China; College of Ship Electrical Engineering, Dalian Maritime University, Dalian 116026, China)
Source Automation & Instrumentation (《自动化与仪表》), 2023, No. 10, pp. 37-41 (5 pages)
Funding National Natural Science Foundation of China (61903209).
Keywords reinforcement learning; deep reinforcement learning; tabular memory; path planning; value function decomposition
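
The abstract describes a table that records high-value state-action pairs and whose contents are also placed in the experience replay buffer. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: "high value" is taken to mean the highest return observed for a pair, the table evicts its lowest-return entry when full, and the names `TabularMemory`, `ReplayBuffer`, and `push_memory_to_buffer` are hypothetical, not taken from the paper. The QMIX networks and training loop are omitted.

```python
import random
from collections import deque


class TabularMemory:
    """Hypothetical table: per (state, action) pair, keep the best return seen."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.table = {}  # (state, action) -> (best_return, transition)

    def update(self, state, action, ret, transition):
        key = (state, action)
        best = self.table.get(key)
        # Assumption: an entry is overwritten only by a strictly higher return.
        if best is None or ret > best[0]:
            self.table[key] = (ret, transition)
        # Assumption: when over capacity, evict the lowest-return entry.
        if len(self.table) > self.capacity:
            worst = min(self.table, key=lambda k: self.table[k][0])
            del self.table[worst]

    def top_transitions(self, k):
        """Return the transitions of the k highest-return entries, best first."""
        ranked = sorted(self.table.values(), key=lambda e: e[0], reverse=True)
        return [transition for _, transition in ranked[:k]]


class ReplayBuffer:
    """Plain FIFO experience replay buffer with uniform sampling."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        items = list(self.buffer)
        return random.sample(items, min(batch_size, len(items)))


def push_memory_to_buffer(memory, buffer, k):
    """Copy the k highest-return transitions from the table into the buffer."""
    moved = memory.top_transitions(k)
    for transition in moved:
        buffer.add(transition)
    return len(moved)
```

In such a setup, `update` would be called at the end of each episode with the return obtained from each visited pair, and `push_memory_to_buffer` would be called periodically so that high-return transitions are sampled more often during training. How the paper actually selects, scores, and replays table entries may differ.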