Resource Allocation Algorithm of Urban Rail Train-to-Train Communication with A2C-ac


Abstract: In urban rail transit train control systems, Train-to-Train (T2T) communication, a new-generation train communication mode, uses direct communication between trains to reduce communication delay and improve train operation efficiency. For scenarios in which T2T communication coexists with Train-to-Ground (T2G) communication, this paper proposes an improved Advantage Actor-Critic (A2C-ac) resource allocation algorithm based on Multi-Agent Deep Reinforcement Learning (MADRL) that addresses the interference caused by reusing T2G links while guaranteeing user communication quality. First, with system throughput as the optimization objective and each T2T transmitter as an agent, the policy network adopts a hierarchical output structure to guide the agent in selecting the spectrum resource and power level to reuse. The agent then takes the corresponding action and interacts with the T2T communication environment to obtain the throughput of T2G users and T2T users in that time slot. The value network evaluates the two separately, and a weight factor β is used to customize a weighted Temporal Difference (TD) error for each agent, allowing the neural network parameters to be optimized flexibly. Finally, the agents jointly select the best spectrum resources and power levels according to the trained model. Simulation results show that, compared with the A2C and Deep Q-Network (DQN) algorithms, the proposed algorithm achieves clear improvements in convergence speed, T2T successful access rate, and throughput.
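The two mechanisms named in the abstract, a hierarchical policy output and a β-weighted TD error, can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: that "hierarchical" means the power level is chosen conditioned on the selected spectrum resource, that the weighted TD error takes the convex form β·δ_T2G + (1 − β)·δ_T2T, and all function and parameter names (`sample_hierarchical_action`, `weighted_td_error`, `gamma`, and so on) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hierarchical_action(channel_logits, power_logits_per_channel, rng):
    """Two-stage action selection (assumed reading of 'hierarchical output').

    Stage 1 samples a spectrum resource; stage 2 samples a power level from
    the logits associated with that resource. Returns the joint log-probability,
    which an actor update would multiply by the advantage signal.
    """
    channel_probs = softmax(channel_logits)
    channel = rng.choices(range(len(channel_probs)), weights=channel_probs)[0]
    power_probs = softmax(power_logits_per_channel[channel])
    power = rng.choices(range(len(power_probs)), weights=power_probs)[0]
    log_prob = math.log(channel_probs[channel]) + math.log(power_probs[power])
    return channel, power, log_prob


def weighted_td_error(r_t2g, r_t2t, v_t2g, v_t2t, v_t2g_next, v_t2t_next,
                      beta, gamma=0.99):
    """Combine separate T2G and T2T TD errors with a per-agent weight beta.

    Assumed form: beta * delta_T2G + (1 - beta) * delta_T2T, where each delta
    is the usual one-step TD error r + gamma * V(s') - V(s).
    """
    td_t2g = r_t2g + gamma * v_t2g_next - v_t2g
    td_t2t = r_t2t + gamma * v_t2t_next - v_t2t
    return beta * td_t2g + (1.0 - beta) * td_t2t


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical setup: 3 spectrum resources, 2 power levels per resource.
    channel, power, log_prob = sample_hierarchical_action(
        [0.0, 1.0, 2.0], [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], rng)
    delta = weighted_td_error(1.0, 2.0, 0.5, 0.5, 0.0, 0.0, beta=0.25, gamma=0.9)
    print(channel, power, log_prob, delta)
```

Under this assumed form, a larger β makes an agent's update more sensitive to the T2G throughput estimate and a smaller β favors the T2T estimate, which is one plausible way to read "customizing" the TD error per agent.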

Authors: WANG Ruifeng; ZHANG Ming; HUANG Ziheng; HE Tao (School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; Automatic Control Institute, Lanzhou Jiaotong University, Lanzhou 730070, China)

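Affiliations: School of Automation and Electrical Engineering, Lanzhou Jiaotong University; Automatic Control Institute, Lanzhou Jiaotong University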

Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2024, No. 4, pp. 1306-1313 (8 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

Funding: Joint Fund for Basic Railway Research of the National Natural Science Foundation of China (U2268206).

Keywords: Urban rail transit; Resource allocation; Train-to-Train (T2T) communication; Multi-Agent Deep Reinforcement Learning (MADRL); Advantage Actor-Critic-ac (A2C-ac) algorithm

Classification number: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]




