
Driverless Obstacle Avoidance and Tracking Control Based on Improved DDPG (cited by: 6)
Abstract: In obstacle-avoidance tracking control of driverless vehicles, the controlled object is nonlinear and its control parameters vary. Linear models and fixed mathematical models of the vehicle struggle to guarantee safety and stability in complex environments, and discretizing the control process makes control harder still. To improve the accuracy of real-time trajectory tracking while reducing the difficulty of the overall control process, the paper proposes an obstacle-avoidance tracking control algorithm for driverless vehicles based on Monte Carlo deep deterministic policy gradient (MC-DDPG). The algorithm builds the control system model on a deep reinforcement learning network and draws on high-quality training samples during policy-learning sampling. It uses the Monte Carlo method to optimize the network training gradient, separates good training samples from poor ones, and uses the good samples in a gradient-based search for the optimal network parameters. This strengthens the network's learning ability and yields better continuous control of the vehicle. Simulation experiments in the TORCS computer simulation environment show that MC-DDPG effectively achieves obstacle-avoidance tracking control, and that the tracking accuracy and obstacle-avoidance performance of the vehicle it controls are better than those obtained with the deep Q-network (DQN) algorithm and the DDPG algorithm.
Authors: LI Xinkai (李新凯); HU Xiaocheng (虎晓诚); MA Ping (马萍); ZHANG Hongli (张宏立) (School of Electrical Engineering, Xinjiang University, Urumqi 830017, Xinjiang, China)
Source: Journal of South China University of Technology (Natural Science Edition) (《华南理工大学学报(自然科学版)》), 2023, No. 11, pp. 44-55 (12 pages). Indexed in: EI, CAS, CSCD, Peking University Core (北大核心).
Funding: National Natural Science Foundation of China (62263030); Natural Science Foundation of Xinjiang Uygur Autonomous Region, Youth Science Fund (2022D01C86).
Keywords: self-driving; dynamic obstacle avoidance; deep deterministic policy gradient; trajectory tracking; gradient optimization
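
The abstract says that training samples are separated into good and poor ones using the Monte Carlo method, but it does not state the selection rule. The following is a minimal Python sketch of one way such a filter might look, not the authors' implementation. The names `monte_carlo_returns` and `select_good_episodes`, the discount factor `gamma`, and the "at least the batch mean" threshold are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# One transition as (state, action, reward); this layout is an assumption.
Transition = Tuple[Sequence[float], Sequence[float], float]


def monte_carlo_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Discounted return G_t = r_t + gamma * G_{t+1}, computed backwards over one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def episode_score(episode: Sequence[Transition], gamma: float = 0.99) -> float:
    """Monte Carlo return from the first step of the episode (0.0 if it is empty)."""
    returns = monte_carlo_returns([reward for _, _, reward in episode], gamma)
    return returns[0] if returns else 0.0


def select_good_episodes(
    episodes: Sequence[Sequence[Transition]], gamma: float = 0.99
) -> List[Sequence[Transition]]:
    """Keep episodes scoring at least the batch mean (hypothetical rule, not from the paper)."""
    if not episodes:
        return []
    scores = [episode_score(episode, gamma) for episode in episodes]
    mean_score = sum(scores) / len(scores)
    return [ep for ep, score in zip(episodes, scores) if score >= mean_score]
```

In a DDPG-style training loop, the transitions from the retained episodes would then be the ones pushed to the replay buffer or weighted more heavily in the actor and critic updates. Which of these the paper does is not stated in the abstract.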