Abstract
Traffic datasets collected by intelligent transportation systems inevitably suffer from missing data caused by many factors. To address this problem, a Bayesian lognormal tensor CP decomposition imputation algorithm is proposed. Conventional matrix factorization is extended to higher-order tensors, which preserves the original structure of the data. Using Bayesian inference, the algorithm iterates over a set of random variables that follow a lognormal distribution, combining the likelihood of each parameter with its prior in turn to obtain the posterior. A Gibbs sampling model is then derived through the Markov chain Monte Carlo (MCMC) method. A spatiotemporal traffic speed dataset collected in Guangzhou, China, is organized as second-order, third-order, and fourth-order tensors for comparison, and the performance of the algorithm is evaluated. The results show that, compared with other methods, the algorithm achieves better imputation performance on third-order tensor data.
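The pipeline described in the abstract (log-transform the speeds, factorize the tensor with CP, sample the factors with Gibbs updates) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the priors, so the code assumes a zero-mean Gaussian prior with fixed precision `lam` on each factor row and a Gamma prior on the noise precision `tau`. All function names, tensor sizes, and the synthetic data are hypothetical.

```python
import numpy as np


def khatri_rao_except(factors, skip):
    """Row-wise products of all factor matrices except mode `skip`.

    Row order matches np.moveaxis(T, skip, 0).reshape(T.shape[skip], -1).
    """
    others = [f for m, f in enumerate(factors) if m != skip]
    out = others[0]
    for f in others[1:]:
        out = (out[:, None, :] * f[None, :, :]).reshape(-1, f.shape[1])
    return out


def cp_reconstruct(factors):
    """Rebuild the full tensor from its CP factor matrices."""
    shape = tuple(f.shape[0] for f in factors)
    return (factors[0] @ khatri_rao_except(factors, 0).T).reshape(shape)


def gibbs_log_cp(X, mask, rank=3, n_iter=200, burn_in=100,
                 lam=1.0, a0=1e-3, b0=1e-3, seed=0):
    """Gibbs sampler for CP factorization of log(X) on observed entries.

    Assumed model (not taken from the paper):
      log x ~ Normal(CP(factors), 1/tau), factor rows ~ Normal(0, I/lam),
      tau ~ Gamma(a0, rate=b0).
    Returns the average of exp(sampled log-scale mean) after burn-in.
    """
    rng = np.random.default_rng(seed)
    # Log-transform observed entries; missing entries are never read.
    Y = np.where(mask, np.log(np.where(mask, X, 1.0)), 0.0)
    factors = [0.1 * rng.standard_normal((d, rank)) for d in X.shape]
    tau, est = 1.0, np.zeros(X.shape)
    for it in range(n_iter):
        for n in range(X.ndim):
            Yn = np.moveaxis(Y, n, 0).reshape(X.shape[n], -1)
            Mn = np.moveaxis(mask, n, 0).reshape(X.shape[n], -1)
            K = khatri_rao_except(factors, n)
            for i in range(X.shape[n]):
                Ko = K[Mn[i]]                       # design rows of observed entries
                prec = lam * np.eye(rank) + tau * Ko.T @ Ko
                mean = np.linalg.solve(prec, tau * Ko.T @ Yn[i, Mn[i]])
                L = np.linalg.cholesky(prec)        # prec = L @ L.T
                z = rng.standard_normal(rank)
                # mean + L^{-T} z has covariance prec^{-1}
                factors[n][i] = mean + np.linalg.solve(L.T, z)
        Y_hat = cp_reconstruct(factors)
        resid = (Y - Y_hat)[mask]
        # Conjugate Gamma update for the noise precision (NumPy uses scale = 1/rate).
        tau = rng.gamma(a0 + 0.5 * resid.size, 1.0 / (b0 + 0.5 * resid @ resid))
        if it >= burn_in:
            est += np.exp(Y_hat)
    return est / (n_iter - burn_in)


# Synthetic demo: hypothetical road x day x time-slot speeds, 30% missing.
rng = np.random.default_rng(1)
roads, days, slots = 12, 14, 8
true = [rng.uniform(0.8, 1.2, (d, 2)) for d in (roads, days, slots)]
X = np.exp(cp_reconstruct(true) + 0.05 * rng.standard_normal((roads, days, slots)))
mask = rng.random(X.shape) > 0.3

# The same data viewed as second-, third-, and fourth-order tensors.
views = {2: (roads, days * slots), 3: (roads, days, slots),
         4: (roads, days // 7, 7, slots)}
for order, shape in views.items():
    X_hat = gibbs_log_cp(X.reshape(shape), mask.reshape(shape)).reshape(X.shape)
    mape = np.mean(np.abs(X_hat[~mask] - X[~mask]) / X[~mask])
    print(f"order {order}: MAPE on missing entries = {mape:.3f}")
```

The loop at the end mirrors the comparison in the abstract by reshaping one array into three tensor orders. The road, week, day-of-week, time-slot layout used for the fourth-order view is an assumption, and results on this synthetic data say nothing about the Guangzhou dataset.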
Authors
Li Xiaopei (李小沛)
Li Fanzhang (李凡长)
Liang Helan (梁合兰)
School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2021, No. 7, pp. 214-221 (8 pages)
Funding
National Natural Science Foundation of China (61672364, 61672365, 61902269)
National Key Research and Development Program of China (2018YFA07070, 2018YFA0701701)
Keywords
Lognormal distribution
Intelligent transportation
Tensor CP decomposition
Bayesian inference
Spatiotemporal traffic data imputation