Funding: Gansu Education Department [Grant Number 2021CXZX-515]; National Natural Science Foundation of China [Grant Number 61763028].
Abstract: Realising adaptive traffic signal control (ATSC) through reinforcement learning (RL) is an important means of easing traffic congestion. This paper finds that the computing power of the central processing unit (CPU) cannot be fully used when Simulation of Urban MObility (SUMO) serves as the environment simulator for RL. We propose a multi-process framework under value-based RL. First, we propose a shared memory mechanism to improve exploration efficiency. Second, we use a weight sharing mechanism to solve the problem of asynchronous multi-process agents. We also explain why shared memory in ATSC does not lead the agent into early local optima. Experiments verify that the sampling efficiency of the 10-process method is 8.259 times that of a single process, and that of the 20-process method is 13.409 times that of a single process. Moreover, the agent still converges to the optimal solution.
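The multi-process sampling idea described above can be sketched as follows. This is a minimal illustration only: the environment step is a stub in place of a SUMO/TraCI simulator, the worker count and transition format are arbitrary assumptions, and thread-backed `multiprocessing.dummy` is used so the sketch stays portable (the paper uses true OS processes, where the shared replay memory and weights would come from `multiprocessing.Manager()`).

```python
from multiprocessing.dummy import Process  # thread-backed drop-in for mp.Process
import queue
import random

def env_step(state, action, weights):
    # Stub dynamics standing in for one SUMO/TraCI simulation step.
    next_state = 0.9 * state + weights["w"] * (action - 1.5)
    return next_state, -abs(next_state)

def worker(worker_id, shared_weights, replay, n_steps):
    # Each worker drives its own simulator instance and appends
    # transitions to the shared replay memory (the "shared memory
    # mechanism"); all workers read the same shared weights.
    state = 0.0
    rng = random.Random(worker_id)
    for _ in range(n_steps):
        action = rng.randrange(4)  # epsilon-greedy exploration in the paper
        next_state, reward = env_step(state, action, shared_weights)
        replay.put((state, action, reward, next_state))
        state = next_state

shared_weights = {"w": 0.1}   # with real processes: mp.Manager().dict()
replay = queue.Queue()        # with real processes: mp.Manager().Queue()
workers = [Process(target=worker, args=(i, shared_weights, replay, 50))
           for i in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(replay.qsize())  # 4 workers x 50 transitions each
```

A learner process would periodically sample mini-batches from the shared replay memory and write updated network weights back into the shared dictionary, which is how the weight sharing mechanism keeps the asynchronous workers consistent.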
Funding: Project supported by the UM High Impact Research MoE Grant from the Ministry of Education, Malaysia (No. UM.C/625/1/HIR/MOHE/FCSIT/09).
Abstract: The importance of adaptive traffic signal control for coping with the unpredictable traffic congestion of today's metropolitan life cannot be overemphasized. The vehicular ad hoc network (VANET), as an integral component of intelligent transportation systems (ITSs), is a potent new technology that has recently gained the attention of academics as a replacement for the traditional instruments that feed information to adaptive traffic signal control systems (TSCSs). However, existing VANET-based TSCS approaches have two weaknesses: (1) imperfect compatibility of the signal timing algorithms with the obtained VANET-based data types, and (2) an inefficient process of gathering and transmitting vehicle density information from the perspective of network quality of service (QoS). This paper proposes an approach that mitigates these problems and improves TSCS performance by decreasing vehicle waiting time, and consequently pollutant emissions, at intersections. To achieve these goals, a combination of vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication is used. The V2V communication scheme incorporates the procedure of calculating vehicle density within clusters, and V2I communication is employed to transfer the computed density and prioritized movement information to the roadside traffic controller. The main traffic input for traffic assessment in this approach is the queue length of vehicle clusters at the intersections. The proposed approach is compared with MC-DRIVE, a popular VANET-based approach, as well as a traditional simple adaptive TSCS based on the Webster method. The evaluation results show the superiority of the proposed approach under both traffic and network QoS criteria.
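As a rough illustration of how cluster queue lengths reported over V2I might drive signal timing, the sketch below allocates green time in proportion to the queue length of each approach. The function name, cycle parameters, and minimum-green floor are illustrative assumptions, not the paper's algorithm.

```python
def green_splits(queue_lengths, cycle=90, lost_per_phase=4, min_green=7):
    """Allocate green time per phase in proportion to the reported
    cluster queue lengths (all timing parameters are illustrative)."""
    phases = len(queue_lengths)
    # Effective green: cycle time minus per-phase lost time (clearance etc.).
    effective = cycle - lost_per_phase * phases
    total = sum(queue_lengths)
    if total == 0:
        return [effective / phases] * phases
    # Proportional split with a minimum-green floor for safety.
    return [max(min_green, effective * q / total) for q in queue_lengths]

print(green_splits([12, 4, 9, 2]))
```

A Webster-style baseline, by contrast, would size the cycle from saturation flow ratios rather than from live per-cluster queue lengths, which is the comparison the abstract describes.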
Funding: Science & Technology Research and Development Program of China Railway (Grant No. N2021G045); the Beijing Municipal Natural Science Foundation (Grant No. L191013); the Joint Funds of the Natural Science Foundation of China (Grant No. U1934222).
Abstract: Reinforcement learning-based traffic signal control systems (RLTSC) can enhance dynamic adaptability, save vehicle travelling time and increase intersection capacity. However, existing RLTSC methods do not consider the driver's response time requirement, so these systems often face efficiency limitations and implementation difficulties. We propose the advance decision-making reinforcement learning traffic signal control (AD-RLTSC) algorithm to improve traffic efficiency while ensuring safety in a mixed traffic environment. First, the relationship between the intersection perception range and the signal control period is established, and the trust region state (TRS) is proposed. Then, a scalable state matrix is dynamically adjusted to decide the future signal light status. The decision is displayed to human-driven vehicles (HDVs) through a bi-countdown timer mechanism and sent to nearby connected automated vehicles (CAVs) over the wireless network, rather than being executed immediately. HDVs and CAVs optimize their driving speed based on the remaining green (or red) time. In addition, the Double Dueling Deep Q-learning Network algorithm is used for reinforcement learning training; a standardized reward is proposed to enhance intersection control performance, and prioritized experience replay is adopted to improve sample utilization. Experimental results on vehicle micro-behaviour and traffic macro-efficiency show that the proposed AD-RLTSC algorithm improves both traffic efficiency and traffic flow stability.
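The Double Dueling DQN machinery the abstract refers to is a standard construction, and its two core computations can be sketched without a neural network: the dueling head combines a state value with mean-centred action advantages, and the double-DQN target selects the next action with the online network but evaluates it with the target network. All names and the toy Q-values below are illustrative, not the paper's implementation.

```python
def dueling_q(value, advantages):
    # Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    # Double DQN: the online net picks the next action,
    # the target net evaluates it (reduces overestimation bias).
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]

print(dueling_q(1.0, [0.0, 2.0, 4.0]))
print(double_dqn_target(1.0, 0.9, [0.1, 0.5], [2.0, 1.0], done=False))
```

Prioritized experience replay would then sample transitions for this update in proportion to their temporal-difference error rather than uniformly, which is the sample-utilization improvement the abstract mentions.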
Abstract: Optimization of adaptive traffic signal timing is one of the most complex problems in traffic control systems. This paper presents an adaptive transit signal priority (TSP) strategy that applies a parallel genetic algorithm (PGA) to optimize adaptive traffic signal control in the presence of TSP. The method can optimize the phase plan, cycle length, and green splits at isolated intersections, with consideration for the performance of both transit and general vehicles. A VISSIM (VISual SIMulation) testbed was developed to evaluate the proposed PGA-based adaptive traffic signal control with TSP. The simulation results show that the PGA-based optimizer for adaptive TSP outperformed fully actuated NEMA control in all test cases. The results also show that the PGA-based optimizer can produce TSP timing plans that benefit transit vehicles while minimizing the impact of TSP on general vehicles.
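The GA loop at the core of such an optimizer can be sketched as follows. A toy delay proxy stands in for the VISSIM fitness evaluations; since those simulation runs are the expensive part, they are what a parallel GA distributes across worker processes, while the sketch here evaluates serially. All names, parameters, and the demand vector are illustrative assumptions.

```python
import random

def delay_proxy(plan, demand=(0.4, 0.3, 0.2, 0.1)):
    # Toy fitness: penalize mismatch between each phase's green share
    # and its demand share (stands in for a VISSIM delay measurement).
    total = sum(plan)
    return sum((g / total - d) ** 2 for g, d in zip(plan, demand))

def evolve(pop_size=20, generations=30, seed=1):
    rng = random.Random(seed)
    # Each individual is a candidate timing plan: four green splits in seconds.
    pop = [[rng.randint(5, 60) for _ in range(4)] for _ in range(pop_size)]
    for _ in range(generations):
        # A parallel GA would farm these fitness calls out to a process pool;
        # the proxy is cheap, so we sort serially and keep the best half.
        pop.sort(key=delay_proxy)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, 4)          # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:             # mutation
                child[rng.randrange(4)] = rng.randint(5, 60)
            children.append(child)
        pop = parents + children
    return min(pop, key=delay_proxy)

print(evolve())
```

Extending this to TSP would add a transit-priority term to the fitness, so plans that delay an approaching bus are penalized alongside general-vehicle delay.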