Abstract
Some time series are high-dimensional, large in volume, and updated rapidly, which makes data mining directly on the raw series difficult. To address this problem, this paper proposes PLR_IE, a piecewise linear representation method for time series based on information entropy. The algorithm uses information entropy as the performance measure for judging the number of important points, extracts from the series the distribution of the number of important segmentation points, and refits the original time series with the sequence formed by those important points, providing a basis for subsequent data mining. Experimental results show that the method efficiently extracts the main features of a series and fits the original series.
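The quantities named in this abstract can be illustrated with a minimal sketch. It does not reproduce PLR_IE: the abstract does not say how entropy decides the number of important points, so the sketch keeps the endpoints and local extrema as stand-in "important points" (an assumption). The function names, the equal-width binning in `shannon_entropy`, and the definitions of compression ratio (fraction of points discarded) and fitting error (Euclidean norm of the residual) are also assumptions, chosen because they are common in the piecewise linear representation literature.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=8):
    """Shannon entropy in bits of the values after equal-width binning (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant series carries no information
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def turning_points(series):
    """Indices of the two endpoints and of strict local extrema.

    Stand-in for important-point selection; NOT the PLR_IE criterion.
    """
    idx = [0]
    for i in range(1, len(series) - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:  # slope changes sign at i
            idx.append(i)
    idx.append(len(series) - 1)
    return idx


def reconstruct(series, idx):
    """Rebuild a full-length series by linear interpolation between kept points."""
    out = []
    for a, b in zip(idx, idx[1:]):
        for i in range(a, b):
            t = (i - a) / (b - a)
            out.append(series[a] + t * (series[b] - series[a]))
    out.append(series[idx[-1]])
    return out


def compression_ratio(n_original, n_kept):
    """Fraction of the original points that were discarded (assumed definition)."""
    return 1.0 - n_kept / n_original


def fitting_error(series, approx):
    """Euclidean distance between the original and the fitted series (assumed definition)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(series, approx)))


if __name__ == "__main__":
    data = [0, 1, 2, 1, 0, 1, 2, 3]
    kept = turning_points(data)
    fitted = reconstruct(data, kept)
    print(kept, compression_ratio(len(data), len(kept)), fitting_error(data, fitted))
```

Keeping fewer points raises the compression ratio but usually raises the fitting error as well, which is why a method of this kind needs a criterion for how many important points to retain.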
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journal (北大核心)
2013, No. 8, pp. 2391-2394 (4 pages)
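Funding
Shandong Provincial Natural Science Foundation (ZR2011FQ029, ZR2011FL026)
Shandong Province Science and Technology Development Program (2011YD01099, 2011YD01100)
Shandong Province Higher Education Science and Technology Program (J11LG32)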
Keywords
time series
information entropy
piecewise linear representation
compression ratio
fitting error