Abstract
To address the requirement in Symbolic Aggregate approXimation (SAX) that a time series be split into equal-length segments, this paper proposes SMSAX, a time series symbolization algorithm based on a Segment Mode (SM). A triangular threshold method extracts features from randomly sampled time series; the maximum compression ratio of the series is then computed and used as the time window width for extracting segmentation points, from which the segment mode of the time series is derived. The segment mode is used to segment the time series and reduce its dimensionality, and each resulting sub-sequence is symbolized as a vector of its mean and volatility. The algorithm thus segments a time series into unequal-length pieces according to its features and uses volatility to eliminate the influence of singular points. Experimental results show that SMSAX produces more accurate results than SAX.
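The symbolization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the segmentation points are already known (the triangular threshold method and the compression-ratio window are not reproduced here), takes "volatility" to be the population standard deviation of each sub-sequence, reuses the standard SAX breakpoints for a 4-letter alphabet on the mean, and uses hypothetical volatility thresholds (`VOL_BREAKPOINTS`) and a hypothetical function name (`symbolize`).

```python
from bisect import bisect_right
from statistics import fmean, pstdev

# Standard SAX breakpoints for a 4-letter alphabet (quartiles of N(0, 1)).
MEAN_BREAKPOINTS = [-0.6745, 0.0, 0.6745]
# Hypothetical volatility thresholds; the abstract does not specify any.
VOL_BREAKPOINTS = [0.25, 0.75]


def z_normalize(series):
    """Rescale a series to zero mean and unit variance, as SAX does."""
    mu = fmean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def symbolize(series, cut_points):
    """Map each unequal-length segment to a (mean symbol, volatility symbol) pair.

    cut_points holds the start index of every segment after the first;
    here they are assumed to be given rather than derived from a segment mode.
    """
    z = z_normalize(series)
    bounds = [0, *cut_points, len(z)]
    word = []
    for start, end in zip(bounds, bounds[1:]):
        segment = z[start:end]
        # Assumption: volatility is the population standard deviation.
        mean_symbol = "abcd"[bisect_right(MEAN_BREAKPOINTS, fmean(segment))]
        vol_symbol = "LMH"[bisect_right(VOL_BREAKPOINTS, pstdev(segment))]
        word.append((mean_symbol, vol_symbol))
    return word


# Toy series: a flat low run, a flat high run, and a short run with a spike.
series = [0.0, 0.1, 0.0, 0.1, 5.0, 5.2, 4.9, 5.1, 5.0, 2.0, 8.0, 2.0]
result = symbolize(series, cut_points=[4, 9])
print(result)
```

In this toy series the last segment contains a single spike. Its mean symbol alone would make it look like a moderately high plateau, while the volatility symbol marks it as unstable, which is the kind of singular-point effect the abstract says the volatility component is meant to capture.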
Source
《计算机工程》
CAS
CSCD
PKU Core
2011, No. 4, pp. 55-57 (3 pages)
Computer Engineering
Funding
National Natural Science Foundation of China (Grant No. 60634020)
Keywords
Segment Mode (SM)
time series
dimensionality reduction
sub-sequence symbolization