Abstract
To address the slow convergence and low learning efficiency of traditional algorithms, intelligent algorithms, and reinforcement learning algorithms in automated guided vehicle (AGV) path planning, a heuristic reinforcement learning algorithm is proposed. Building on the traditional Q(λ) algorithm, a heuristic reward function and a heuristic action selection strategy are designed to strengthen the agent's exploration of high-quality actions and improve the learning efficiency of the algorithm. Comparative simulation experiments show that the heuristic reinforcement learning algorithm based on the improved Q(λ) has certain advantages in number of explorations, planning time, path length, and path turning angle.
Authors
唐恒亮
唐滋芳
董晨刚
尹棋正
海秋茹
TANG Hengliang; TANG Zifang; DONG Chengang; YIN Qizheng; HAI Qiuru (School of Information, Beijing Wuzi University, Beijing 101149, China)
Source
《北京工业大学学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2021, No. 8, pp. 895-903 (9 pages)
Journal of Beijing University of Technology
Funding
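Humanities and Social Sciences Foundation of the Ministry of Education of China (20YJCZH200)
Science and Technology Program of the Beijing Municipal Education Commission (KM202110037002)
Beijing "Gaochuang Program" Young Top-Notch Talent Project (2017000026833ZK25)
Canal Plan Leading Talent Project of Tongzhou District, Beijing (YHLB2017038)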