Abstract
KP_SAX, a key point-based improvement of the Symbolic Aggregate approximation (SAX) algorithm, uses key points to add a point-distance measure on top of SAX, which lets it compute time series similarity more effectively than SAX. However, it captures too little of the pattern information in a time series, so it still cannot measure similarity reasonably. To address the defects of SAX and KP_SAX, a composite metric method for time series similarity based on SAX is proposed. The method combines two measures, point distance and pattern distance. First, key points are used to subdivide the equal-length segments produced by Piecewise Aggregate Approximation (PAA) into sub-segments. Next, each sub-segment is represented by a triple that carries both kinds of distance information. Finally, a composite distance formula is used to compute the similarity between time series, and the result reflects the differences between them more effectively. Experimental results show that the time efficiency of the proposed method is only 0.96% lower than that of KP_SAX, while it outperforms both KP_SAX and the traditional SAX algorithm in distinguishing between time series.
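The representation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `znorm`, `paa`, and `sax_symbol` follow the standard SAX recipe, while the key-point rule (interior local extrema) and the triple layout `(symbol, trend, length)` are hypothetical stand-ins, since the abstract gives neither the key-point definition nor the contents of the triple. The composite distance formula is not stated in the abstract, so it is not sketched here.

```python
from statistics import NormalDist, mean, pstdev


def znorm(series):
    """Z-normalize a series to zero mean and unit variance."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """PAA: mean of each equal-length segment.

    Assumes len(series) is divisible by n_segments, for simplicity.
    """
    size = len(series) // n_segments
    return [mean(series[i * size:(i + 1) * size]) for i in range(n_segments)]


def sax_symbol(value, alphabet_size):
    """Map a value to a symbol index using equiprobable N(0, 1) breakpoints."""
    cuts = [NormalDist().inv_cdf(k / alphabet_size)
            for k in range(1, alphabet_size)]
    return sum(value > c for c in cuts)


def key_points(segment):
    """Hypothetical key-point rule: interior local extrema of the segment."""
    return [i for i in range(1, len(segment) - 1)
            if (segment[i] - segment[i - 1]) * (segment[i + 1] - segment[i]) < 0]


def sub_segment_triples(series, n_segments, alphabet_size):
    """Split each PAA segment at its key points and describe every
    sub-segment as an assumed triple (symbol, trend, length)."""
    z = znorm(series)
    size = len(z) // n_segments
    triples = []
    for s in range(n_segments):
        seg = z[s * size:(s + 1) * size]
        bounds = [0] + key_points(seg) + [len(seg) - 1]
        for a, b in zip(bounds, bounds[1:]):
            piece = seg[a:b + 1]
            symbol = sax_symbol(mean(piece), alphabet_size)  # point information
            diff = piece[-1] - piece[0]
            trend = (diff > 0) - (diff < 0)                  # pattern: -1, 0, or 1
            triples.append((symbol, trend, len(piece)))
    return triples
```

For example, `sub_segment_triples([0, 1, 2, 3, 2, 1, 0, 1], 2, 4)` keeps the rising first segment whole and splits the second segment at its local minimum, giving one falling and one rising sub-segment.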
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core (北大核心)
2013, Issue 1, pp. 192-198 (7 pages)
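Funding
National Natural Science Foundation of China (61070062, 61175123)
Major Science and Technology Project of Industry-University Cooperation in Fujian Universities (2010H6007)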
Keywords
time series
Symbolic Aggregate approximation (SAX)
similarity
pattern distance
composite metric