Abstract
This paper proposes DRLARM (Deep Reinforcement Learning-Based Adaptive Ramp Metering), a robust adaptive ramp metering model that accounts for traffic flow conditions on both the mainline and the ramp in merging areas. Based on traffic flow operating characteristics, a reinforcement learning reward function is constructed that balances mainline traffic efficiency against ramp queue length. To adapt to a dynamically changing traffic environment, the control model is trained on a mixture of traffic flow scenarios, and simulation experiments are conducted under test scenarios that differ in congestion cause, congestion duration, and demand distribution. Four evaluation metrics are compared across the no-control case and the DRLARM, ALINEA, and PI-ALINEA models: average vehicle travel time A, lane occupancy o, ramp queue length W, and ramp loss time ratio P. The results show that DRLARM reduces the average travel time A by 22% relative to the no-control case, which is slightly better than ALINEA and comparable to PI-ALINEA. The ramp loss time ratio P produced by DRLARM is relatively stable across the test scenarios, and the absolute ramp queue length W is about 16% shorter than under either ALINEA or PI-ALINEA. The deep reinforcement learning method balances traffic efficiency with right-of-way fairness, and the trained DRLARM model shows good robustness under dynamic traffic conditions.
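For readers unfamiliar with the two baseline controllers named in the abstract, the following is a minimal sketch of the standard ALINEA and PI-ALINEA feedback laws as they are commonly stated in the ramp metering literature. It is not taken from the paper: the function names, gain values, and metering-rate bounds are illustrative assumptions, and the paper's own parameter settings are not given in this abstract.

```python
def alinea_step(prev_rate, occ, target_occ,
                k_r=70.0, r_min=200.0, r_max=1800.0):
    """One ALINEA update: integral feedback on downstream occupancy.

    prev_rate  -- metering rate from the previous interval (veh/h)
    occ        -- measured downstream occupancy in this interval (%)
    target_occ -- desired (critical) occupancy (%)
    k_r, r_min, r_max are illustrative values, not the paper's settings.
    """
    rate = prev_rate + k_r * (target_occ - occ)
    # Keep the rate inside the admissible metering range.
    return max(r_min, min(r_max, rate))


def pi_alinea_step(prev_rate, occ, prev_occ, target_occ,
                   k_r=70.0, k_p=100.0, r_min=200.0, r_max=1800.0):
    """One PI-ALINEA update: ALINEA plus a proportional term that
    reacts to the change in occupancy between consecutive intervals.
    The gain k_p is an assumed, illustrative value.
    """
    rate = (prev_rate
            - k_p * (occ - prev_occ)
            + k_r * (target_occ - occ))
    return max(r_min, min(r_max, rate))
```

In both functions the metering rate falls when measured occupancy exceeds the target and rises when it is below the target. A learned policy such as DRLARM would replace these fixed update rules with a mapping trained against the reward described in the abstract.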
Authors
章立辉
余宏鑫
熊满初
胡文琴
王亦兵
ZHANG Lihui; YU Hongxin; XIONG Manchu; HU Wenqin; WANG Yibing (Institute of Intelligent Transportation Systems, College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, Zhejiang, China; Architectural Design and Research Institute Co., Ltd., Zhejiang University, Hangzhou 310014, Zhejiang, China; Research Center for Balance Architecture, Zhejiang University, Hangzhou 310014, Zhejiang, China)
Source
《重庆交通大学学报(自然科学版)》
CAS
CSCD
PKU Core
2023, No. 4, pp. 87-97, 107 (12 pages in total)
Journal of Chongqing Jiaotong University (Natural Science)
Funding
National Key Research and Development Program of China (2018YFB1600500)
Key Research and Development Program of Zhejiang Province (2021C01012).
Keywords
traffic engineering
adaptive ramp metering
deep reinforcement learning
freeway
ramp queue management
robustness