Joint Training Method for Autonomous Driving Based on Reinforcement Learning


Abstract: The control algorithm is central to autonomous driving. To design a more efficient control algorithm that adapts to multiple environments, this paper proposes a joint training method based on reinforcement learning. An autonomous driving control algorithm should run stably across road conditions, weather conditions, and driving scenarios, so an algorithm designed with artificial intelligence methods must take all of these into account. The method is built on the TORCS simulation software and includes neural networks that fit the action and state spaces, a training strategy, a reward function, and related mechanisms. Multiple agents are trained with reinforcement learning in different environments, and a joint training algorithm lets them train together and share the experience each has learned, which improves the generalization of the model. The method achieves joint training of multiple reinforcement learning agents and was verified experimentally, yielding an efficient and stable training strategy.
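The abstract does not say how the agents share experience. As an illustration only, the sketch below assumes one common mechanism: a replay buffer that every agent writes to and samples from. `SharedReplayBuffer`, `ToyEnv`, and `collect` are hypothetical names, the toy environment is a stand-in rather than the TORCS API, and the random action is a placeholder for a learned policy.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (state, action, reward, next_state, done): a generic RL transition.
Transition = Tuple[float, float, float, float, bool]


class SharedReplayBuffer:
    """Hypothetical experience pool written to and read by every agent."""

    def __init__(self, capacity: int) -> None:
        # Oldest transitions are dropped once capacity is reached.
        self._data: Deque[Tuple[int, Transition]] = deque(maxlen=capacity)

    def add(self, agent_id: int, transition: Transition) -> None:
        self._data.append((agent_id, transition))

    def sample(self, batch_size: int, rng: random.Random) -> List[Tuple[int, Transition]]:
        # Uniform sampling over all agents' experience, without replacement.
        k = min(batch_size, len(self._data))
        return rng.sample(list(self._data), k)

    def __len__(self) -> int:
        return len(self._data)


class ToyEnv:
    """Stand-in for one environment configuration; NOT the TORCS API."""

    def __init__(self, friction: float) -> None:
        self.friction = friction
        self.state = 0.0

    def step(self, action: float) -> Tuple[float, float, bool]:
        self.state += action * self.friction
        reward = -abs(self.state)  # toy reward: stay close to zero offset
        return self.state, reward, False


def collect(envs: List[ToyEnv], buffer: SharedReplayBuffer, steps: int, rng: random.Random) -> None:
    """Each agent acts in its own environment and stores into the shared pool."""
    for _ in range(steps):
        for agent_id, env in enumerate(envs):
            state = env.state
            action = rng.uniform(-1.0, 1.0)  # placeholder for a learned policy
            next_state, reward, done = env.step(action)
            buffer.add(agent_id, (state, action, reward, next_state, done))


rng = random.Random(0)
envs = [ToyEnv(friction=f) for f in (0.5, 1.0, 1.5)]  # three differing environments
buffer = SharedReplayBuffer(capacity=1000)
collect(envs, buffer, steps=50, rng=rng)

# Any agent can now draw a training batch that mixes experience from all environments.
batch = buffer.sample(32, rng)
agents_in_batch = {agent_id for agent_id, _ in batch}
```

Sampling uniformly from one pool is the simplest way to expose each agent to transitions gathered under other conditions. Whether the paper uses a shared buffer, parameter averaging, or another scheme is not stated in the abstract.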

Authors: Chen Hengxing (陈恒星); Liu Yiming (刘一鸣) (Business School, Macao University of Science and Technology, Macao 999078, China; Business School, Sun Yat-sen University, Guangzhou 510006, China)

Affiliations: Business School, Macao University of Science and Technology; Business School, Sun Yat-sen University

Source: Mechanical & Electrical Engineering Technology (《机电工程技术》), 2024, No. 3, pp. 131-135 (5 pages)

Keywords: reinforcement learning; autonomous driving; artificial intelligence

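Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; U471 [Mechanical Engineering: Vehicle Engineering]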


