A multi process value-based reinforcement learning environment framework for adaptive traffic signal control

Abstract: Realising adaptive traffic signal control (ATSC) through reinforcement learning (RL) is an important means of easing traffic congestion. This paper finds that the computing power of the central processing unit (CPU) cannot be fully used when Simulation of Urban MObility (SUMO) is used as an environment simulator for RL. We propose a multi-process framework under value-based RL. First, we propose a shared memory mechanism to improve exploration efficiency. Second, we use a weight sharing mechanism to solve the problem of asynchronous multi-process agents. We also explain why shared memory in ATSC does not lead the agent to early local optima. Experiments verify that the sampling efficiency of the 10-process method is 8.259 times that of the single-process method, and that the sampling efficiency of the 20-process method is 13.409 times that of the single-process method. Moreover, the agent can still converge to the optimal solution.
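One way to read the two reported speedups is as parallel efficiency, that is, speedup divided by the number of processes. This metric is a standard convention and is not something the abstract itself reports, and the names `reported_speedup` and `parallel_efficiency` are illustrative only. A minimal sketch of the arithmetic, using only the figures quoted in the abstract:

```python
# Sampling-efficiency speedups quoted in the abstract, keyed by process count.
reported_speedup = {10: 8.259, 20: 13.409}


def parallel_efficiency(speedup: float, n_processes: int) -> float:
    """Speedup per process; 1.0 would correspond to perfectly linear scaling."""
    return speedup / n_processes


# Efficiency for each configuration reported in the abstract.
efficiency = {n: parallel_efficiency(s, n) for n, s in reported_speedup.items()}

for n, e in efficiency.items():
    print(f"{n} processes: {e:.1%} of linear scaling")
```

Under this reading, the 10-process run reaches roughly 83% of linear scaling and the 20-process run roughly 67%, so doubling the process count raises sampling throughput by about 1.62 times (13.409 / 8.259) rather than 2 times.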

Authors: Jie Cao, Dailin Huang, Liang Hou, Jialin Ma

Affiliations: College of Computer and Communication Engineering; Research Center of Manufacturing Information of Gansu Province

Source: Journal of Control and Decision (控制与决策学报, English edition), EI-indexed, 2023, Issue 2, pp. 229-236 (8 pages)

Funding: Gansu Education Department [Grant Number 2021CXZX-515]; National Natural Science Foundation of China [Grant Number 61763028].

Keywords: adaptive traffic signal control; Simulation of Urban MObility; multi-process; reinforcement learning; value-based

Classification number: TN9 [Electronics and Telecommunications: Information and Communication Engineering]
