Abstract
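In massive sequence data, predicting the likely behavioral patterns of a user group over a future period is a meaningful and challenging research problem. Taking the travel of public transit user groups as an example, this paper introduces the phase space reconstruction method and uses massive sequence data to model a large-scale system and simulate its dynamic evolution pattern. In addition, considering the shortcomings of general phase space prediction methods on big data, it proposes the similarity inflection point method to automatically select similar points before prediction; this method not only reduces the complexity of similarity computation during prediction but also significantly improves prediction performance. Experiments show that the proposed method offers new ideas for system modeling with massive (periodic) sequence data and for predicting group behavior over a period of time.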
In massive sequence data, predicting the behavioral patterns of user groups over a future period is a meaningful research problem. Within this field, the behavioral patterns of public transportation user groups are particularly representative, and they reflect the main characteristics of urban residents and cities, because public transportation is the primary means by which urban residents travel. Promoting the intelligent development of urban computing is of great significance for improving the efficiency of public resource use and optimizing the management of urban public transportation. Research on traffic flow prediction has undergone a long period of development. Previous research considered only short-term traffic flow prediction; however, reasonable long-term prediction may provide better services for traffic management. With the emergence of intelligent transportation, people expect to use public transit big data to accurately predict the long-term travel behavior of user groups. Taking the behavior of public transport user groups as an example, this paper introduces the phase space reconstruction method to predict the nature and regularity of mass transit group sequence behaviors, and uses massive sequence data to model the large-scale system and simulate its dynamic evolution process. However, the phase space reconstruction method faces two problems: one is the selection of the number of similar points in the phase space; the other is the quality of the phase space reconstruction. With respect to the first problem, after the general phase space reconstruction method maps the data to the phase space, the K-nearest-neighbor method is normally used to find similar points within the time frame for prediction. However, this method is sensitive to the number of neighbors K and can produce a large error. Given these flaws, this paper proposes the similarity inflection point method for the automatic selection of similar points before prediction; that is, the P most similar points are automatically selected for prediction within a large K-nearest neighborhood. This method not only reduces the complexity of similarity calculation in the prediction process, but also significantly improves prediction performance. With respect to the second problem, previous studies have evaluated the quality of phase space reconstructions only through prediction performance. This paper not only measures the quality of the phase space reconstruction from the forecast results, but also compares different prediction results and phase diagrams, and characterizes the relationships between them, through a series of parameter experiments. The parameter experiments show that the phase diagram changes significantly under different parameters, and that there is a certain correlation between a high-quality phase diagram and high-precision prediction. This indicates that the phase space reconstruction method can better describe the behavioral patterns of public transportation user groups, and demonstrates the effectiveness of the phase-space-reconstruction-based prediction method used in this paper. The final experimental results show that the method in this paper has clear advantages over other time series prediction methods. In particular, the proposed similarity inflection point method significantly improves prediction accuracy. At the same time, this paper proposes new ideas for exploring the use of massive (periodic) sequence data for system modeling and for predicting group behavior over a period of time.
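The abstract names the pipeline (delay embedding into a phase space, a large K-nearest neighborhood, then automatic selection of the P most similar points) but does not give its details. The following is only a minimal sketch under stated assumptions: the function names, the parameters `dim`, `tau`, and `k`, and in particular the cutoff rule in `pick_cutoff` (keep the neighbors before the largest jump in sorted distance) are hypothetical illustrations, not the authors' similarity inflection point method.

```python
import math


def delay_embed(series, dim, tau):
    """Map a scalar series to delay vectors (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    span = (dim - 1) * tau
    if len(series) <= span:
        raise ValueError("series too short for this dim and tau")
    return [tuple(series[t + j * tau] for j in range(dim))
            for t in range(len(series) - span)]


def pick_cutoff(sorted_dists):
    """Hypothetical cutoff rule (an assumption, not taken from the paper):
    keep the neighbors that come before the largest jump in sorted distance."""
    if len(sorted_dists) < 2:
        return len(sorted_dists)
    gaps = [sorted_dists[i + 1] - sorted_dists[i]
            for i in range(len(sorted_dists) - 1)]
    return gaps.index(max(gaps)) + 1


def predict_next(series, dim=3, tau=1, k=10):
    """Predict the next value from the successors of the selected neighbors."""
    points = delay_embed(series, dim, tau)
    if len(points) < 2:
        raise ValueError("need at least two phase points to predict")
    span = (dim - 1) * tau
    query = points[-1]  # most recent state in the reconstructed phase space
    # Every earlier phase point has a known successor value in the series.
    candidates = [(math.dist(p, query), series[t + span + 1])
                  for t, p in enumerate(points[:-1])]
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]                       # large K-neighborhood
    p_count = pick_cutoff([d for d, _ in nearest])  # automatic choice of P <= K
    chosen = nearest[:p_count]
    return sum(v for _, v in chosen) / len(chosen)
```

For a toy periodic series such as `[0, 1, 2, 3] * 6`, `predict_next(series, dim=3, tau=1, k=10)` keeps only the exact repeats of the latest state and averages their successors. Real ridership data would need a tuned `dim` and `tau`, which the abstract says the paper studies through parameter experiments.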
Authors
冯路
钱宇
白梦娜
袁华
FENG Lu; QIAN Yu; BAI Mengna; YUAN Hua (School of Management and Economics, University of Electronic Science and Technology of China, Chengdu 611731, China)
Source
《管理工程学报》
CSSCI
CSCD
Peking University Core Journal
2020, Issue 4, pp. 126-134 (9 pages)
Journal of Industrial Engineering and Engineering Management
Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 71572029, 71671027, 71490723, 71271044).
Keywords
Massive sequence data
Phase space reconstruction
Similarity
Prediction