Abstract
To enable safe decision-making for vehicles on highways, this paper proposes a behavioral decision-making model that combines deep reinforcement learning (DRL) with a risk correction method. First, the driving information of the target vehicle and its surrounding vehicles, which the decision model requires, is constructed. A self-attention safety mechanism is introduced to increase the vehicle's attention to potentially dangerous surrounding vehicles in complex highway scenarios, and the reinforcement learning reward function is designed to account for factors such as driving efficiency and obstacle avoidance. In addition, to address the lack of safety guarantees in reinforcement learning decision-making, a risk correction module is designed that assesses the risk of each decision action and corrects it, preventing dangerous decisions from being executed. The proposed model is trained and tested on the Highway-env simulation platform. The results show that it achieves a high driving safety rate and robustness, and that its driving efficiency exceeds that of decision-making methods based on rules, imitation learning, and pure DRL.
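The risk correction idea described in the abstract (assess the action chosen by the policy, and replace it if it is judged dangerous) can be illustrated with a minimal sketch. This is not the authors' implementation: the time-to-collision (TTC) criterion, the 2-second threshold, the `Gap` type, the function names, and the action labels are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Gap:
    """Hypothetical summary of the vehicle ahead in the lane an action leads to."""
    distance_m: float         # longitudinal distance to the lead vehicle, in metres
    closing_speed_mps: float  # ego speed minus lead speed; > 0 means the gap is shrinking


def time_to_collision(gap: Gap) -> float:
    """Return TTC in seconds; infinite when the gap is not shrinking."""
    if gap.closing_speed_mps <= 0:
        return float("inf")
    return gap.distance_m / gap.closing_speed_mps


def correct_action(
    ranked_actions: Sequence[str],
    gap_for_action: Callable[[str], Gap],
    ttc_threshold_s: float = 2.0,   # assumed threshold, not taken from the paper
    fallback: str = "SLOWER",       # assumed conservative default action
) -> str:
    """Return the highest-ranked action whose TTC meets the threshold.

    `ranked_actions` is the policy's preference order (best first).
    If every candidate is judged risky, the fallback action is returned.
    """
    for action in ranked_actions:
        if time_to_collision(gap_for_action(action)) >= ttc_threshold_s:
            return action
    return fallback
```

Under these assumptions, a policy that prefers `"FASTER"` while closing on a lead vehicle 10 m ahead at 8 m/s (TTC of 1.25 s) would have that action rejected, and the next-ranked action that passes the check would be executed instead.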
Authors
ZHAN Yinxiao (詹吟霄); LIU Xiao (刘潇); LIANG Jun (梁军)
China State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310058, China
Source
Chinese Journal of Automotive Engineering (《汽车工程学报》)
2023, No. 5, pp. 656-667 (12 pages)
Funding
National Key Research and Development Program of China (2019YFB1600500).
Keywords
autonomous driving
deep reinforcement learning
decision-making model
risk correction
attention mechanism
reward function