Abstract
To address the continuous dynamic allocation of driving weights between the human driver and the autonomous driving system in human-machine integration (HMI) driving systems of intelligent vehicles, and in particular the poor adaptability of weight allocation methods caused by modeling errors, an HMI steering decision-making method based on reinforcement learning was proposed. Considering drivers' steering characteristics, a driver model based on two-point preview was built, and an autonomous steering control model for intelligent vehicles was established using predictive control theory. On this basis, a steering control framework with the human and the machine simultaneously in the loop was constructed. Based on the Actor-Critic reinforcement learning architecture, a deep deterministic policy gradient (DDPG) agent for human-machine driving weight allocation was designed, and a model-based reward function was proposed with curvature adaptability, tracking accuracy, and ride comfort as objectives. A reinforcement learning framework for HMI driving weight allocation was then constructed, comprising the driver model, the autonomous steering model, the driving weight allocation agent, and the reward function. To verify the effectiveness of the method, eight drivers were recruited and a total of 48 simulated driving trials were carried out. The results show that, in the curvature adaptability verification, the HMI-DDPG method outperforms both manual driving and the HMI-Fuzzy method: tracking performance improves by an average of 70.69% and 39.67%, respectively, and comfort improves by an average of 18.34% and 7.55%, respectively. In the speed adaptability verification, at vehicle speeds of 40, 60, and 80 km·h⁻¹, the driver's weight is greater than 0.5 for 90.00%, 85.76%, and 60.74% of the time, respectively, and the phase trajectories of both tracking performance and comfort converge effectively. The proposed method can therefore adapt to changes in curvature and vehicle speed, and it improves tracking performance and comfort while ensuring safety. 5 tabs, 14 figs, 31 refs.
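As a rough illustration of the quantities the abstract refers to, the sketch below blends a driver steering command with an autonomous steering command using a driver weight, and computes the share of samples in which the driver's weight exceeds 0.5. The linear blending law, the function names, and the assumption of uniformly sampled weights are hypothetical choices made for illustration; the abstract does not state how the paper combines the two commands.

```python
from typing import Sequence


def blend_steering(driver_angle: float, auto_angle: float, driver_weight: float) -> float:
    """Blend two steering commands with a driver weight in [0, 1].

    Assumption: a linear weighted sum, which is a common shared-control
    form but is not confirmed by the abstract.
    """
    if not 0.0 <= driver_weight <= 1.0:
        raise ValueError("driver_weight must lie in [0, 1]")
    return driver_weight * driver_angle + (1.0 - driver_weight) * auto_angle


def driver_dominant_share(weights: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of samples whose driver weight is strictly above the threshold.

    Assumption: weights are sampled at a fixed time step, so the sample
    fraction equals the time proportion.
    """
    if len(weights) == 0:
        raise ValueError("weights must not be empty")
    return sum(1 for w in weights if w > threshold) / len(weights)


if __name__ == "__main__":
    # Hypothetical values, not data from the paper.
    print(blend_steering(driver_angle=0.10, auto_angle=0.02, driver_weight=0.75))
    print(driver_dominant_share([0.9, 0.6, 0.5, 0.4, 0.7]))
```

In this reading, a DDPG agent would output `driver_weight` at each control step, and the reported 90.00%, 85.76%, and 60.74% figures correspond to `driver_dominant_share` evaluated over a run at each speed.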
Authors
吴超仲
冷姚
陈志军
罗鹏
WU Chao-zhong; LENG Yao; CHEN Zhi-jun; LUO Peng (Intelligent Transportation Systems Research Center, Wuhan University of Technology, Wuhan 430063, Hubei, China; School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430063, Hubei, China; School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430063, Hubei, China)
Source
《交通运输工程学报》
EI
CSCD
Peking University Core Journals (北大核心)
2022, No. 3, pp. 55-67 (13 pages)
Journal of Traffic and Transportation Engineering
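Funding
National Natural Science Foundation of China (52172394)
National Key Research and Development Program of China (2018YFB1600600)
Major Science and Technology Project of Hubei Province (2020AAA001)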
Keywords
intelligent vehicle
human-machine integration
steering decision-making
driving weight allocation
reinforcement learning