Traffic Signal Timing via Deep Reinforcement Learning (cited by: 60)

Abstract: In this paper, we propose a set of algorithms to design signal timing plans via deep reinforcement learning. The core idea of this approach is to set up a deep neural network (DNN) to learn the Q-function of reinforcement learning from the sampled traffic state/control inputs and the corresponding traffic system performance output. Based on the obtained DNN, we can find the appropriate signal timing policies by implicitly modeling the control actions and the change of system states. We explain the possible benefits and implementation tricks of this new approach. The relationships between this new approach and some existing approaches are also carefully discussed. © 2014 Chinese Association of Automation.
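As a rough illustration of the idea summarized in the abstract (a network that learns the Q-function from sampled state/control inputs and the resulting performance), the sketch below trains a very small Q-network on a toy intersection. It is not the authors' algorithm or architecture: the two-approach intersection, the queue-based reward, the single tanh hidden layer, and every name and constant (`QNetwork`, `step`, `train`, `GAMMA`, `MAX_QUEUE`) are assumptions made only for this example.

```python
import math
import random

GAMMA = 0.9      # discount factor (assumed for this sketch)
MAX_QUEUE = 20   # queue cap, also used to normalise inputs (assumed)


class QNetwork:
    """Tiny one-hidden-layer network: state -> one Q-value per action."""

    def __init__(self, n_inputs=2, n_hidden=8, n_actions=2, lr=0.01, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_actions)]
        self.b2 = [0.0] * n_actions

    def _forward(self, state):
        x = [min(q, MAX_QUEUE) / MAX_QUEUE for q in state]
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        q = [sum(w * hj for w, hj in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return x, h, q

    def predict(self, state):
        """Return the list of Q-values, one per signal action."""
        return self._forward(state)[2]

    def update(self, state, action, target):
        """One gradient step on 0.5 * (target - Q(state, action))**2.

        Returns the error measured before the step.
        """
        x, h, q = self._forward(state)
        err = target - q[action]
        for j, hj in enumerate(h):
            # Backpropagate through tanh using the weight before it changes.
            grad_h = self.w2[action][j] * (1.0 - hj * hj)
            self.w2[action][j] += self.lr * err * hj
            self.b1[j] += self.lr * err * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] += self.lr * err * grad_h * xi
        self.b2[action] += self.lr * err
        return err


def step(state, action, rng):
    """Toy intersection: the approach given green discharges up to 3
    vehicles, then each approach receives 0-2 random arrivals."""
    queues = list(state)
    queues[action] = max(0, queues[action] - 3)
    queues = [min(MAX_QUEUE, q + rng.randint(0, 2)) for q in queues]
    reward = -sum(queues) / (2 * MAX_QUEUE)  # performance signal in [-1, 0]
    return tuple(queues), reward


def train(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning with the network as function approximator."""
    rng = random.Random(seed)
    net = QNetwork(seed=seed)
    state = (0, 0)
    for _ in range(steps):
        q = net.predict(state)
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: q[a])
        next_state, reward = step(state, action, rng)
        # Q-learning target: sampled reward plus discounted best next value.
        target = reward + GAMMA * max(net.predict(next_state))
        net.update(state, action, target)
        state = next_state
    return net


if __name__ == "__main__":
    net = train()
    print(net.predict((10, 2)))  # Q-values for "green to approach 0" / "approach 1"
```

The signal timing policy is read off the trained network by choosing the action with the largest predicted Q-value in each state, which is the sense in which the policy is obtained without an explicit model of how actions change the traffic state. A realistic study would replace the toy `step` function with a traffic simulator or field data.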

Authors: Li Li, Yisheng Lv, Fei-Yue Wang

Affiliations: IEEE; Department of Automation; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies; State Key Laboratory of Management and Control for Complex Systems

Source: IEEE/CAA Journal of Automatica Sinica (Chinese title: Acta Automatica Sinica, English Edition), indexed in SCIE and EI, 2016, Issue 3, pp. 247-254,254+248-253, 8 pages in total

Funding: Supported by the National Natural Science Foundation of China (61533019, 71232006, 61233001)

Keywords: traffic control; reinforcement learning; deep learning; deep reinforcement learning; algorithms; timing circuits; traffic signals

Classification code: U491.54 [Transportation Engineering: Transportation Planning and Management]

