
Human-Machine Interactive Reinforcement Learning Method for Autonomous Driving Based on the PPO Algorithm
Abstract: To address the high computational demands and slow convergence of deep reinforcement learning (DRL) in autonomous driving, this paper combines a variational autoencoder (VAE) with the proximal policy optimization (PPO) algorithm. The VAE's feature encoding converts semantic images obtained from the CARLA simulator into state inputs, which reduces the computational burden of DRL on complex autonomous driving tasks. To address local optima and slow convergence during DRL training, the paper introduces a driving intervention mechanism and a driver-guided experience replay mechanism. Driving interventions are applied in the initial training phase and when the model becomes trapped in a local optimum, in order to improve the model's learning efficiency and generalization capability. Experiments in a left-turn scenario at a traffic intersection show that, with the driving intervention mechanism, model performance improves faster in the initial training phase. Interventions applied when the model is trapped in a local optimum improve performance further, and the improvement is more pronounced in complex scenarios.
Authors: Shi Gaosong; Zhao Qinghai; Dong Xin; He Jiahao; Liu Jiayuan (College of Mechanical & Electrical Engineering, Qingdao University, Qingdao, Shandong 266071, China)
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 9, pp. 2732-2736 (5 pages)
Funding: National Natural Science Foundation of China (52175236).
Keywords: autonomous driving; deep reinforcement learning; feature encoding; driving intervention; experience replay
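
The abstract describes two mechanisms without giving details: driving intervention early in training and at local optima, and replay of driver-guided experience. The following is a minimal Python sketch of how such a scheme could be organized, not the authors' implementation. Everything in it is an assumption made for illustration: the plateau test in `should_intervene`, the parameters `warmup_episodes`, `window`, `min_gain`, and `driver_fraction`, and the `Transition` layout (a VAE latent vector as state, steering and throttle as action).

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Transition:
    state: Tuple[float, ...]   # assumed: latent vector produced by the VAE encoder
    action: Tuple[float, ...]  # assumed: (steer, throttle)
    reward: float
    from_driver: bool          # True if a human driver produced this action


def should_intervene(episode_returns: List[float],
                     warmup_episodes: int = 20,
                     window: int = 10,
                     min_gain: float = 1.0) -> bool:
    """Hypothetical trigger: intervene during warm-up or when returns plateau."""
    n = len(episode_returns)
    if n < warmup_episodes:
        return True                      # initial training phase
    if n < 2 * window:
        return False                     # not enough history to judge a plateau
    recent = sum(episode_returns[-window:]) / window
    previous = sum(episode_returns[-2 * window:-window]) / window
    return recent - previous < min_gain  # little progress: possible local optimum


class DriverGuidedBuffer:
    """Keeps driver and agent transitions apart and mixes them when sampling."""

    def __init__(self, capacity: int = 10_000,
                 driver_fraction: float = 0.25, seed: int = 0):
        self.agent = deque(maxlen=capacity)
        self.driver = deque(maxlen=capacity)
        self.driver_fraction = driver_fraction
        self.rng = random.Random(seed)

    def add(self, t: Transition) -> None:
        (self.driver if t.from_driver else self.agent).append(t)

    def sample(self, batch_size: int) -> List[Transition]:
        # Reserve a share of the batch for driver data, limited by what is stored.
        n_driver = min(len(self.driver), round(batch_size * self.driver_fraction))
        n_agent = min(len(self.agent), batch_size - n_driver)
        batch = (self.rng.sample(list(self.driver), n_driver)
                 + self.rng.sample(list(self.agent), n_agent))
        self.rng.shuffle(batch)
        return batch
```

In a training loop, `should_intervene` would be checked at the start of each episode, and transitions recorded while the driver is in control would be added with `from_driver=True`. Keeping the two pools separate is one simple way to stop scarce driver data from being crowded out by agent rollouts.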