Abstract
Classifying station-level ticket revenue rates and forecasting the revenue of each railway bureau can guide transport departments in drawing up revenue plans and high-speed train operation plans. This paper uses the station-by-station information for through high-speed trains in the passenger train diagram issued by China Railway Corporation. First, a K-means-CACC method is proposed to discretize the ticket revenue rate, which overcomes the shortcomings of unsupervised discretization, and the CACC algorithm is used to discretize the influencing-factor data. Second, an optimal revenue rate selection method based on the intersection of error intervals and sample density is designed to find, within each class, the revenue rate that satisfies the error range. Finally, the random forest algorithm is used to classify the ticket revenue rate and to forecast the revenue of each railway bureau. The experimental results show that the proposed discretization and classification algorithms classify the revenue rate with relatively high accuracy, and that the proposed revenue rate selection algorithm forecasts ticket revenue well.
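As a rough illustration of the unsupervised half of the discretization step described in the abstract, the sketch below clusters scalar revenue rates with plain one-dimensional K-means and uses the cluster index as a class label. It is only a minimal sketch under stated assumptions: the function name `kmeans_1d`, the deterministic initialisation, and the sample rates are all hypothetical, and the supervised CACC refinement that the paper combines with K-means is not shown.

```python
from statistics import mean


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalars; returns (centers, labels).

    Hypothetical helper for illustration only, not the paper's K-means-CACC.
    """
    distinct = sorted(set(values))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Assumed deterministic initialisation: spread the starting centers
    # evenly over the sorted distinct values.
    if k == 1:
        centers = [mean(values)]
    else:
        centers = [distinct[i * (len(distinct) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Made-up revenue rates, purely for demonstration.
rates = [0.10, 0.12, 0.11, 0.50, 0.52, 0.90, 0.95]
centers, labels = kmeans_1d(rates, k=3)
print(labels)
```

The resulting labels could then serve as the target classes for a classifier such as a random forest, with the influencing factors as features; how the paper actually couples the two steps is not specified in the abstract.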
Source
《铁道学报》 (Journal of the China Railway Society), 2018, No. 3, pp. 23-28 (6 pages)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
Funding
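Science and Technology Research and Development Program of China Railway Corporation (2016X005-A, 2017X004-C, J2016X005)
Research Project of the China Academy of Railway Sciences (2016YJ100, 2016YJ108)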