Abstract
A D3QN model, derived from Q-learning, is used to implement a traffic signal control agent. The model uses a discrete traffic state encoding as its state set: the two-dimensional position-velocity matrix of vehicles at the intersection is passed through convolutional layers for feature extraction, so as to capture more accurate and complete intersection information. The agent is trained in simulation with the SUMO traffic simulation software under two action strategies, a phase-switching strategy and a Markov decision process (MDP) action strategy. The results show that, compared with the traditional fixed-time, fixed-sequence signal control strategy, the average vehicle waiting time is reduced by about 45% under the phase-switching strategy and by about 78% under the MDP action strategy.
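The two technical ingredients named in the abstract can be illustrated with a minimal sketch. It assumes that D3QN is read in the usual way as a dueling double deep Q-network, and that the discrete state encoding divides each approach lane into fixed-length cells holding an occupancy flag and a normalized speed. The function names, the cell length, the lane length, the speed limit and the discount factor are all hypothetical; the abstract does not give the authors' values or network architecture.

```python
import numpy as np

def encode_lane(positions_m, speeds_mps, lane_length_m=120.0, cell_m=6.0, v_max=13.9):
    """Discretize one approach lane into cells (all sizes are assumed values).

    Returns two rows: cell occupancy (0/1) and speed normalized by v_max.
    Stacking these rows over all lanes gives a position-velocity matrix
    that a convolutional network could take as input.
    """
    n_cells = int(lane_length_m // cell_m)
    occupancy = np.zeros(n_cells, dtype=np.float32)
    speed = np.zeros(n_cells, dtype=np.float32)
    for pos, v in zip(positions_m, speeds_mps):
        if 0.0 <= pos < lane_length_m:
            idx = min(int(pos // cell_m), n_cells - 1)
            occupancy[idx] = 1.0
            speed[idx] = min(v / v_max, 1.0)
    return occupancy, speed

def dueling_q(value, advantage):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)."""
    return value + advantage - advantage.mean(axis=-1, keepdims=True)

def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double DQN target: the online net picks the action, the target net scores it."""
    best = np.argmax(q_next_online, axis=-1)
    next_q = np.take_along_axis(q_next_target, best[..., None], axis=-1).squeeze(-1)
    return reward + gamma * (1.0 - done) * next_q

# Hypothetical usage: two vehicles on one lane, and one transition with two actions.
occ, spd = encode_lane(positions_m=[3.0, 50.0], speeds_mps=[0.0, 6.95])
q = dueling_q(np.array([[1.0]]), np.array([[1.0, 3.0]]))
target = double_dqn_target(
    reward=np.array([1.0]),
    done=np.array([0.0]),
    q_next_online=np.array([[0.2, 0.9]]),
    q_next_target=np.array([[5.0, 2.0]]),
)
```

In this sketch the reward is left as a plain number; in a setting like the one described it would presumably be derived from vehicle waiting time, but the abstract does not specify the reward function.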
Authors
SONG Guozhi (宋国治)
SU Pengbo (苏鹏博)
LIU Chang (刘畅)
CHEN Yuge (陈玉格)
School of Computer Science and Technology, Tiangong University, Tianjin 300387, China
Source
Journal of Zhengzhou University: Natural Science Edition (《郑州大学学报(理学版)》)
CAS
Peking University Core Journals (北大核心)
2022, No. 5, pp. 57-63 (7 pages)
Funding
National Natural Science Foundation of China (61972456).