Abstract
To accurately reproduce the disturbance behavior of non-motorized vehicles on road segments and meet the needs of autonomous driving simulation testing, this paper proposes a Position Reward Augmented Generative Adversarial Imitation Learning (PRA-GAIL) method for training the simulation model. On urban roads, disturbance behavior is produced mainly by electric bicycles, so electric bicycles are taken as the research object. In the constructed simulation environment, Generative Adversarial Imitation Learning (GAIL) updates the simulation model so that simulated trajectories progressively approach real trajectories, while a position reward and a Lagrangian constraint method are added to address the homogenization and uncontrollable behavior found in existing simulation methods. On the test set, the average per-step displacement error of GAIL and PRA-GAIL is 61.7% and 65.8% lower, respectively, than that of the commonly used behavioral cloning method. At the behavioral level, compared with GAIL, PRA-GAIL significantly reduces the KL divergence between the simulated and real acceleration distributions, and the percentage errors in the numbers of line-crossing and overtaking events fall by 7.2% and 20.2%, respectively. Adding safety constraints with the Lagrangian method reduces the number of agents exhibiting risky behavior by 75.8% compared with the commonly used reward augmentation method. At the trajectory level, in the full simulation environment, the average per-step displacement error of PRA-GAIL is 17.5% lower than that of GAIL. The model realistically reproduces the maneuvering space used by non-motorized vehicles when overtaking, which indicates that PRA-GAIL is well suited to simulating non-motorized vehicle behavior. The proposed modifications effectively improve simulation performance, and the resulting model realistically reproduces the disturbance behavior of non-motorized vehicles on road segments and can be applied to autonomous driving simulation testing.
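The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the histogram binning, the additive form of the reward, the weight `w_pos`, the learning rate `lr`, and the cost limit are all assumptions made for illustration. The sketch shows an average per-step displacement error, a KL divergence between two binned acceleration distributions, and a generic dual-ascent update of a Lagrange multiplier that penalizes a safety cost.

```python
import math


def average_displacement_error(sim, real):
    """Mean Euclidean distance per time step between two equal-length (x, y) trajectories."""
    if len(sim) != len(real) or not sim:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, q) for p, q in zip(sim, real)) / len(sim)


def histogram(values, lo, hi, bins, eps=1e-6):
    """Normalized histogram; eps smoothing keeps every bin strictly positive."""
    counts = [eps] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clip out-of-range values into the first or last bin.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions defined over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def lagrange_multiplier_step(lam, avg_cost, cost_limit, lr=0.1):
    """Dual ascent (assumed form): raise lam when the average safety cost
    exceeds the limit, lower it otherwise, and never let it go below zero."""
    return max(0.0, lam + lr * (avg_cost - cost_limit))


def penalized_reward(imitation_reward, position_reward, cost, lam, w_pos=1.0):
    """Assumed additive combination: discriminator-based imitation reward
    plus a weighted position reward, minus the Lagrangian safety penalty."""
    return imitation_reward + w_pos * position_reward - lam * cost


if __name__ == "__main__":
    sim = [(0.0, 0.0), (1.0, 0.0)]
    real = [(0.0, 0.0), (1.0, 1.0)]
    print("ADE:", average_displacement_error(sim, real))

    real_acc = histogram([-0.5, 0.0, 0.1, 0.4], lo=-1.0, hi=1.0, bins=4)
    sim_acc = histogram([0.0, 0.1, 0.2, 0.3], lo=-1.0, hi=1.0, bins=4)
    print("KL(real || sim):", kl_divergence(real_acc, sim_acc))

    print("updated multiplier:", lagrange_multiplier_step(0.0, avg_cost=0.3, cost_limit=0.1))
```

In a constrained setup of this kind, the multiplier grows only while the measured safety cost stays above its limit, so the penalty weight adapts during training instead of being fixed in advance as in plain reward augmentation.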
Authors
WEI Shuqiao (魏书樵)
NI Ying (倪颖)
SUN Jian (孙剑)
QIU Hongtong (邱红桐)
Affiliations: Key Laboratory of Road and Traffic Engineering of the Ministry of Education, Tongji University, Shanghai 201804, China; Traffic Management Research Institute of the Ministry of Public Security, Wuxi 214151, Jiangsu, China
Source
Journal of Transportation Systems Engineering and Information Technology (《交通运输系统工程与信息》)
Indexed in: EI; CSCD; Peking University Core Journals
2024, Issue 4, pp. 105-115 (11 pages)
Funding
National Key Research and Development Program of China (2019YFB1600200)
National Natural Science Foundation of China (52072262)
Keywords
traffic engineering
non-motorized vehicle behavior
reinforcement learning
generative adversarial imitation learning
autonomous driving test
microscopic traffic simulation