Abstract
Path planning for autonomous mobile robots (AMRs) is a key technology in cargo handling, warehousing, and logistics. When the working environment in a factory changes, an AMR that relies on a reinforcement learning algorithm alone is slow to relearn the optimal path. To solve this problem, a strategy transfer reinforcement learning algorithm based on Q-learning is proposed. The algorithm computes a similarity from the adjacent state transitions saved in the source task and those saved in the target task. Based on the magnitude of the similarity and a weight, the source-task strategy is selectively transferred, while random exploration and the strategy newly learned in the target task are each used with a certain probability. The effectiveness of the proposed algorithm is validated on an AMR cooperative handling task. Compared with other methods, the proposed algorithm has a stronger jump-start (startup ability) and converges faster.
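The abstract does not give the similarity formula, the weight, or the exact selection rule, so the following is only a minimal sketch of how similarity-gated strategy transfer on top of tabular Q-learning could look. Everything in it is an assumption made for illustration: the Jaccard-style overlap in `transition_similarity`, the rule `weight * similarity` as the transfer probability, and all function and parameter names (`select_action`, `q_update`, `epsilon`, `alpha`, `gamma`).

```python
import random
from collections import defaultdict


def transition_similarity(source_transitions, target_transitions, state):
    """Assumed measure: Jaccard overlap of the (action, next_state) pairs
    recorded at `state` in the source task and in the target task."""
    src = source_transitions.get(state, set())
    tgt = target_transitions.get(state, set())
    if not src or not tgt:
        return 0.0
    return len(src & tgt) / len(src | tgt)


def greedy(q_table, state, actions):
    """Action with the highest Q-value in `state` (first one wins ties)."""
    return max(actions, key=lambda a: q_table[(state, a)])


def select_action(state, actions, q_source, q_target,
                  similarity, weight, epsilon, rng=random):
    """Hypothetical three-way choice: explore randomly, reuse the source
    strategy, or follow the strategy newly learned on the target task."""
    if rng.random() < epsilon:
        return rng.choice(actions)               # random exploration
    if rng.random() < weight * similarity:       # assumed transfer rule
        return greedy(q_source, state, actions)  # transferred source strategy
    return greedy(q_target, state, actions)      # target-task strategy


def q_update(q_target, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on the target-task table."""
    best_next = max(q_target[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q_target[(state, action)]
    q_target[(state, action)] += alpha * td_error


if __name__ == "__main__":
    actions = ["up", "down", "left", "right"]
    q_source = defaultdict(float, {((0, 0), "right"): 1.0})
    q_target = defaultdict(float)
    src = {(0, 0): {("right", (0, 1)), ("down", (1, 0))}}
    tgt = {(0, 0): {("right", (0, 1))}}
    sim = transition_similarity(src, tgt, (0, 0))
    action = select_action((0, 0), actions, q_source, q_target,
                           similarity=sim, weight=1.0, epsilon=0.1)
    q_update(q_target, (0, 0), action, reward=0.0,
             next_state=(0, 1), actions=actions)
```

In this sketch, states whose recorded transitions still match the source task lean on the old strategy, while states affected by the environment change fall back to exploration and the target-task Q-table. That is one plausible reading of why transfer could improve early performance, not a reproduction of the paper's method.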
Authors
刘明阳
张震
宋婷婷
周维庆
LIU Mingyang; ZHANG Zhen; SONG Tingting; ZHOU Weiqing (School of Automation, Qingdao University, Qingdao 266071, China; Shandong Key Laboratory of Industrial Control Technology, Qingdao 266071, China; Vehicle Maintenance Department, Third Operation Center of Qingdao Metro Operation Co., Ltd., Qingdao 266071, China)
Source
《控制工程》
CSCD
PKU Core Journal (北大核心)
2024, No. 7, pp. 1195-1202 (8 pages)
Control Engineering of China
Funding
Supported by the National Natural Science Foundation of China (Grant No. 61903209).
Keywords
Transfer learning
Reinforcement learning
State transition
Strategy transfer
Similarity