
Finding Good-Quality Surprising Patterns in Time Series Data
Abstract Discovering surprising patterns in time series databases is an important problem. Existing algorithms define and discover surprising patterns from the shape features of a time series and ignore both its underlying mechanism and its statistical regularities. To overcome this shortcoming, we propose a definition of a surprising pattern based on time series prediction, namely an event that contains sufficiently many exceptions, together with a systematic algorithm for discovering such patterns. The time series is first discretized into a string of 0s and 1s; a simple algorithm then finds all surprising patterns in that string. Experiments show that the proposed algorithm not only finds the surprising patterns defined by Keogh et al. but also avoids reporting meaningless ones.

Aim. Previous methods for finding surprising patterns in time series data suffer, in our opinion, from three shortcomings: (1) they use very limited shape features of the time series data; (2) they ignore the statistical features of the time series data; and (3) they do not recognize that suitable modeling can reduce the number of subsequences that contain surprising patterns. We present what we believe to be a better method. The full paper explains the method in detail; this abstract adds some pertinent remarks on its two topics: (1) the formal description of a surprising pattern and (2) the algorithm for finding surprising patterns. In the first topic, we give a theorem with its proof and five definitions. The second topic has three subtopics: the proposed algorithm (subtopic 2.1), the determination of the threshold values (subtopic 2.2), and the analysis of the computational complexity of the proposed algorithm (subtopic 2.3). In the second topic, we give a five-step flowchart, based on the theorem in the first topic, for finding surprising patterns. Most importantly, in subtopic 2.1 we explain the suitable modeling that reduces the number of subsequences that contain surprising patterns. The algorithm achieves a data compression ratio of about 32:1 or 64:1, so it can be used on massive time series databases. The experimental results, given in a figure in the full paper, show preliminarily that the proposed method not only finds the surprising patterns defined by Keogh et al. but also, through suitable modeling, omits patterns in the time series data that are not really surprising.
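The two stages named in the abstract (discretize the series into a 0/1 string, then scan that string for spans with sufficiently many exceptions) can be illustrated with a minimal sketch. The abstract does not specify the predictor, the threshold rule, or the pattern length, so everything concrete here is an assumption made for illustration: the moving-average forecast, and the hypothetical parameters `window`, `error_threshold`, `length`, and `min_exceptions`. It is not the authors' algorithm.

```python
from typing import List, Tuple


def discretize(series: List[float], window: int, error_threshold: float) -> List[int]:
    """Turn a series into a 0/1 list: 1 marks an "exception".

    Assumption: a point is an exception when a moving-average forecast
    over the previous `window` points misses it by more than
    `error_threshold`. The first `window` points have no forecast and
    are marked 0.
    """
    bits = [0] * len(series)
    for t in range(window, len(series)):
        forecast = sum(series[t - window:t]) / window
        if abs(series[t] - forecast) > error_threshold:
            bits[t] = 1
    return bits


def surprising_windows(bits: List[int], length: int,
                       min_exceptions: int) -> List[Tuple[int, int]]:
    """Return (start, exception_count) for each window of `length` bits
    that holds at least `min_exceptions` ones.

    A running sum keeps the scan linear in the length of `bits`.
    """
    hits: List[Tuple[int, int]] = []
    if length <= 0 or length > len(bits):
        return hits
    count = sum(bits[:length])
    if count >= min_exceptions:
        hits.append((0, count))
    for start in range(1, len(bits) - length + 1):
        # Slide the window one step: add the entering bit, drop the leaving bit.
        count += bits[start + length - 1] - bits[start - 1]
        if count >= min_exceptions:
            hits.append((start, count))
    return hits


# A flat series with a short burst of spikes in the middle.
series = [1.0] * 10 + [5.0, 1.0, 6.0, 1.0] + [1.0] * 10
bits = discretize(series, window=3, error_threshold=1.0)
patterns = surprising_windows(bits, length=4, min_exceptions=3)
print(bits)
print(patterns)
```

In this sketch each floating-point sample is replaced by a single bit. That is one plausible reading of the 32:1 or 64:1 compression figure (one bit per 32-bit or 64-bit value), but the abstract does not confirm it.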
Source Journal of Northwestern Polytechnical University (西北工业大学学报), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2007, No. 3, pp. 425-428 (4 pages)
Funding Supported by the National Natural Science Foundation of China (60573096)
Keywords data mining, time series, time series data, surprising pattern, knowledge acquisition, modeling

References (4)

  • 1 Keogh E, Lonardi S, Chiu B. Finding Surprising Patterns in a Time Series Database in Linear Time and Space. Proc of SIGKDD. Edmonton, Alberta, Canada, 2002
  • 2 Shahabi C, Tian X, Zhao W. TSA-Tree: A Wavelet-Based Approach to Improve the Efficiency of Multi-Level Surprise and Trend Query. Proc of 12th International Conference on Scientific and Statistical Database Management. Berlin, Germany, 2000, 56-68
  • 3 Chakrabarti S, Sarawagi S, Dom B. Mining Surprising Patterns Using Temporal Description Length. Proc of the 24th VLDB. New York, USA, 1998
  • 4 Li Aiguo, Qin Zheng. Predicting chaotic time series with an adaptive local linearization method [J]. Systems Engineering - Theory & Practice, 2004, 24(6): 67-71 (in Chinese)

