Abstract
Freight trains running on long, steep downhill sections regulate their speed by cyclic braking, and poorly timed brake application and release can make train operation unsafe. To address this problem, this study takes an SS6B electric locomotive hauling C80 freight wagons as the research object and establishes a train dynamics model based on a mass belt. With train operating efficiency, operating safety, and brake shoe wear as optimization objectives, and with section speed limits and the brake-cylinder charging time as constraints, it proposes a deep Q-network (DQN)-based algorithm that intelligently generates optimized operating curves for long steep downhill sections and searches for the optimal transition points between cyclic-braking conditions through interaction with the environment. The algorithm samples training data in batches using experience replay and a dual-network mechanism, and preprocesses the state inputs to the neural network. It explores the feasible region of the action space with a variable ε-greedy strategy, constructs a loss function based on the value function, and updates the network parameters iteratively by batch gradient descent. A simulation test environment was built in Matlab. The simulation results show that, when the long steep downhill operation task is trained with randomly generated entry speeds, the cumulative reward gradually converges as the number of training runs increases, which verifies the convergence and generalization of the algorithm. After training, the optimized operating curves generated for different entry speeds control the train so that air braking is applied before the speed limit is reached and released once charging is complete, which ensures safe and efficient operation and further verifies the effectiveness of the algorithm. In addition, a comparison of the average cumulative reward curves for different learning rates and for different distribution ranges of the preprocessed network inputs verifies that the algorithm improves convergence speed and stability. The results provide a reference for further optimizing the generation of operating curves for freight trains on long steep downhill sections and for ensuring both operating efficiency and safety.
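The learning components named in the abstract (input preprocessing, a variable ε-greedy policy restricted to feasible actions, experience replay with batch sampling, and a value-based loss computed against a second network) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which was built in Matlab. The linear ε schedule, the scaling of the state to [0, 1], the discount factor, and every function name and default value below are assumptions chosen only for illustration.

```python
import random
from collections import deque


def epsilon_at(episode, eps_start=1.0, eps_end=0.05, decay_episodes=500):
    """Variable epsilon: linear decay from eps_start to eps_end (assumed schedule)."""
    frac = min(episode / decay_episodes, 1.0)
    return eps_start + frac * (eps_end - eps_start)


def normalize_state(position, speed, route_length, speed_limit):
    """Assumed preprocessing: scale position and speed by their maxima."""
    return (position / route_length, speed / speed_limit)


def select_action(q_values, feasible_actions, epsilon, rng=random):
    """Epsilon-greedy choice restricted to the currently feasible actions."""
    if rng.random() < epsilon:
        return rng.choice(feasible_actions)  # explore
    return max(feasible_actions, key=lambda a: q_values[a])  # exploit


class ReplayBuffer:
    """Fixed-size experience replay memory with uniform batch sampling."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_target, done, gamma=0.99):
    """Target value: r if terminal, else r + gamma * max over target-network Q."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def batch_loss(q_pred, targets):
    """Mean squared error between predicted Q-values and their targets."""
    return sum((y - q) ** 2 for q, y in zip(q_pred, targets)) / len(targets)
```

In a full training loop, `next_q_target` would come from the second (target) network evaluated at the next state, and the gradient of `batch_loss` with respect to the online network's parameters would drive the update. The list passed as `feasible_actions` is where constraints such as the charging time would be enforced in this sketch, for example by offering only one action while the brake system is still charging.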
Authors
何之煜
李一楠
李辉
吉志军
HE Zhiyu; LI Yinan; LI Hui; JI Zhijun (Signal & Communication Research Institute, China Academy of Railway Sciences Co., Ltd., Beijing 100081, China)
Source
Control and Information Technology (《控制与信息技术》)
2024, No. 4, pp. 19-27 (9 pages)
Funding
Fund projects of China Academy of Railway Sciences Corporation Limited (2022HT15, 2023YJ273).
Keywords
freight train
long steep downhill
operational curve
deep Q-network (DQN)
neural network
input preprocessing