End-to-End Autonomous Driving Through Dueling Double Deep Q-Network (Cited by: 10)

Abstract: Recent years have seen the rapid development of autonomous driving systems, which are typically designed in a hierarchical architecture or an end-to-end architecture. The hierarchical architecture is complicated and hard to design, while the end-to-end architecture is more promising due to its simple structure. This paper puts forward an end-to-end autonomous driving method based on a deep reinforcement learning algorithm, Dueling Double Deep Q-Network, making it possible for the vehicle to learn end-to-end driving by itself. First, the paper proposes an architecture for the end-to-end lane-keeping task. Unlike the traditional image-only state space, the presented state space is composed of both camera images and vehicle motion information. Second, the corresponding dueling neural network structure is introduced, which reduces the variance and improves sampling efficiency. Third, the proposed method is applied to The Open Racing Car Simulator (TORCS) to demonstrate its performance, where it surpasses human drivers. Finally, the saliency map of the neural network is visualized, which indicates that the trained network drives by observing the lane lines. A video of the presented work is available online at https://youtu.be/76ciJmIHMD8 or https://v.youku.com/v_show/id_XNDM4ODc0MTM4NA==.html.
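The abstract names Dueling Double Deep Q-Network without giving equations. As a minimal sketch, assuming the standard formulations (mean-centered advantage aggregation for the dueling head, and online-network action selection with target-network evaluation for the double-DQN target), the two ideas can be illustrated in plain Python. The function names, arguments, and numbers below are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Combine a state value V(s) and per-action advantages A(s, a) into Q(s, a).

    Assumes the common mean-centered aggregation:
        Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Compute the double-DQN bootstrap target for one transition.

    The online network chooses the next action; the target network
    evaluates it. Terminal transitions use the reward alone.
    """
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical numbers for three discrete steering actions.
q = dueling_q_values(1.0, [0.0, 2.0, 4.0])
y = double_dqn_target(1.0, 0.5, False, [0.0, 2.0, 1.0], [4.0, 6.0, 10.0])
```

In this sketch the target uses the value 6.0 that the target network assigns to the action preferred by the online network, not the target network's own maximum of 10.0. That decoupling is what the "double" part refers to in the standard formulation.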

Authors: Baiyu Peng, Qi Sun, Shengbo Eben Li, Dongsuk Kum, Yuming Yin, Junqing Wei, Tianyu Gu

Affiliations: State Key Lab of Automotive Safety and Energy; Korea Advanced Institute of Science and Technology; DiDi Autonomous Driving Company

Source: Automotive Innovation (indexed in EI, CSCD), 2021, Issue 3, pp. 328-337 (10 pages). Chinese journal title: 汽车创新工程（英文）

Funding: This work is supported by the National Key Research and Development Project of China under Grant 2018YFB1600600 and by the Beijing Natural Science Foundation under Grant JQ18010. The authors also thank the Tsinghua University-Didi Joint Research Center for Future Mobility for its support.

Keywords: End-to-end autonomous driving; Reinforcement learning; Deep Q-network; Neural network

Classification: U46 [Mechanical Engineering: Vehicle Engineering]

