Abstract
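To fill in incomplete bus arrival time information effectively, a method for imputing incomplete arrival times based on an improved k*-means algorithm is proposed. The similarity between stations is measured by a weighted combination of four features: the number of passengers at arrival, the time period of arrival, the distance between stations, and the travel time between stations. The existing k-means algorithm is improved on this basis to build a complete table of travel times between bus stations. Surface bus operation data from Beijing are used as a case study to verify the reliability of the method, and comparison experiments are run against imputation by linear fitting, nearest-neighbor interpolation, and the k-means algorithm. The results show that the method imputes more than 97% of the incomplete arrival times, and that its average imputation error on known arrival times is no more than 100 s.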
To effectively impute incomplete bus arrival times, an improved k*-means clustering algorithm is proposed in this paper. Four kinds of features are first extracted from historical travel records: travel distance, passenger numbers, time period, and travel time. An improved k*-means algorithm is then developed to cluster these features, and a complete dictionary of travel times between stations is constructed, from which the arrival time between any 2 stations can be imputed indirectly. Empirical data from bus transit routes in Beijing are used to validate the effectiveness of the proposed algorithm. Furthermore, 3 typical imputation methods (linear regression, k-nearest neighbors, and k-means clustering) are adopted for comparison. Experimental results demonstrate that the imputation proportion of this method is over 97%, the highest among the 4 methods. Moreover, the average imputation error is no higher than 100 seconds, which supports the effectiveness of the method.
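The pipeline outlined in the abstract (weight the four features, cluster the records, read missing travel times off the clusters) can be illustrated with a minimal sketch. This is not the paper's improved k*-means: it uses plain Lloyd k-means with a deterministic farthest-point initialization, and the feature weights, the sample records, and the function names `weighted_kmeans` and `impute_travel_time` are all hypothetical choices made for illustration.

```python
import numpy as np

def _init_centers(X, k):
    # Deterministic farthest-point initialization (an assumption, not from the paper).
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers)

def weighted_kmeans(X, weights, k, n_iter=50):
    # Plain Lloyd k-means; scaling by sqrt(weights) yields a weighted squared distance.
    Xw = X * np.sqrt(weights)
    centers = _init_centers(Xw, k)
    for _ in range(n_iter):
        d = ((Xw[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            Xw[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def impute_travel_time(X_complete, X_missing, weights, k):
    # Columns: distance, passengers, time period, travel time (last column is the target).
    # The weight on the travel-time column must be positive.
    mu = X_complete.mean(axis=0)
    sigma = X_complete.std(axis=0)
    sigma[sigma == 0] = 1.0
    sw = np.sqrt(weights)
    _, centers = weighted_kmeans((X_complete - mu) / sigma, weights, k)
    # Match each incomplete record to the nearest center on the observed features only.
    Zm = (X_missing[:, :-1] - mu[:-1]) / sigma[:-1] * sw[:-1]
    d = ((Zm[:, None, :] - centers[None, :, :-1]) ** 2).sum(axis=2)
    nearest = d.argmin(axis=1)
    # Convert the center's travel-time coordinate back to seconds.
    return centers[nearest, -1] / sw[-1] * sigma[-1] + mu[-1]

# Hypothetical records: [distance_m, passengers, period_index, travel_time_s]
complete = np.array([
    [500, 10, 0, 60], [520, 12, 0, 64], [480, 9, 0, 58],
    [2000, 30, 1, 240], [2100, 32, 1, 250], [1900, 28, 1, 230],
], dtype=float)
missing = np.array([[510, 11, 0, np.nan], [2050, 31, 1, np.nan]])
weights = np.array([0.4, 0.2, 0.2, 0.2])  # illustrative weights only

imputed = impute_travel_time(complete, missing, weights, k=2)
```

Each incomplete record takes the mean travel time of the cluster it falls into, which is the simplest cluster-based imputation rule. The paper's method additionally builds a complete station-to-station travel time table, which this sketch does not attempt.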
Authors
赵霞
张勇
尹宝才
刘浩
张可
ZHAO Xia; ZHANG Yong; YIN Baocai; LIU Hao; ZHANG Ke (Multimedia and Intelligent Software Technology Laboratory, Beijing University of Technology, Beijing 100124, China; Beijing Transportation Information Center, Beijing 100073, China; Beijing Transportation Operations Coordination Center, Beijing 100073, China)
Source
《北京工业大学学报》
CAS
CSCD
PKU Core (北大核心)
2018, No. 1, pp. 135-143 (9 pages)
Journal of Beijing University of Technology
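Funding
Beijing Municipal Science and Technology Commission funded projects (Z171100000517003, Z171100000517004)
Beijing Municipal Education Commission funded project (KM201610005033)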