Reinforcement Learning Ramp Metering to Balance Mainline and Ramp Traffic Operations
Abstract Considering the traffic flow conditions of both the mainline and the ramp in merging areas, this study proposes a robust adaptive ramp metering model based on deep reinforcement learning, named Deep Reinforcement Learning-Based Adaptive Ramp Metering (DRLARM). Based on traffic flow operating characteristics, a reinforcement learning reward function is constructed that balances mainline traffic efficiency against ramp queue length. To adapt to a dynamically changing traffic environment, the control model is trained on a mix of multiple traffic flow scenarios, and simulation experiments are conducted under test scenarios with different congestion causes, congestion durations, and demand distributions. Four evaluation indicators are compared across the no-control case and the DRLARM, ALINEA, and PI-ALINEA models: average vehicle travel time A, lane occupancy o, ramp queue length W, and ramp loss time ratio P. The results show that DRLARM reduces the average travel time A by 22% relative to the no-control case, which is slightly better than ALINEA and comparable to PI-ALINEA. The ramp loss time ratio P produced by DRLARM is relatively stable across test scenarios, and the absolute ramp queue length W is about 16% shorter than under both ALINEA and PI-ALINEA. The deep reinforcement learning method accounts for both traffic efficiency and right-of-way fairness, and the trained DRLARM model shows good robustness under dynamic traffic conditions.
Authors ZHANG Lihui; YU Hongxin; XIONG Manchu; HU Wenqin; WANG Yibing (Institute of Intelligent Transportation Systems, College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, Zhejiang, China; Architectural Design and Research Institute Co., Ltd., Zhejiang University, Hangzhou 310014, Zhejiang, China; Research Center for Balance Architecture, Zhejiang University, Hangzhou 310014, Zhejiang, China)
Source Journal of Chongqing Jiaotong University (Natural Science), 2023, No. 4, pp. 87-97, 107 (12 pages in total). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding National Key R&D Program of China (2018YFB1600500); Key R&D Program of Zhejiang Province (2021C01012).
Keywords traffic engineering; adaptive ramp metering; deep reinforcement learning; freeway; ramp queue management; robustness
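For readers unfamiliar with the ALINEA and PI-ALINEA baselines named in the abstract, the following is a minimal sketch of their standard occupancy-feedback update rules as commonly stated in the ramp metering literature. It is not taken from this paper: the function names, gains, target occupancy, and rate bounds are illustrative assumptions, and the paper's own parameter settings are not given in the abstract.

```python
# Illustrative sketch only: all default values below are assumptions,
# not parameters reported in the paper.

def clamp(value, low, high):
    """Keep the metering rate within its admissible bounds."""
    return max(low, min(high, value))


def alinea_step(prev_rate, occ, target_occ=20.0, k_r=70.0,
                min_rate=200.0, max_rate=1800.0):
    """One ALINEA update (integral feedback on downstream occupancy).

    r(k) = r(k-1) + K_R * (o_target - o(k))
    Rates are in veh/h, occupancies in percent.
    """
    rate = prev_rate + k_r * (target_occ - occ)
    return clamp(rate, min_rate, max_rate)


def pi_alinea_step(prev_rate, occ, prev_occ, target_occ=20.0,
                   k_r=70.0, k_p=100.0, min_rate=200.0, max_rate=1800.0):
    """One PI-ALINEA update: ALINEA plus a proportional term on the
    change in occupancy between consecutive control intervals.

    r(k) = r(k-1) - K_P * (o(k) - o(k-1)) + K_R * (o_target - o(k))
    """
    rate = prev_rate - k_p * (occ - prev_occ) + k_r * (target_occ - occ)
    return clamp(rate, min_rate, max_rate)


if __name__ == "__main__":
    # Occupancy above target: both controllers reduce the metering rate.
    print(alinea_step(1000.0, 25.0))
    print(pi_alinea_step(1000.0, 25.0, 22.0))
```

Both rules react only to measured mainline occupancy and carry no explicit ramp-queue term, which is the gap the abstract's reward function, balancing mainline efficiency against ramp queue length, is meant to address.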