Abstract
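A systematic method for measuring similarity and extracting similar patterns from time series, based on polynomial regression analysis, is proposed. The basic idea is to approximate a time series with a piecewise polynomial regression model, map the original series into the feature space spanned by the polynomial coefficients, and derive the Euclidean distance in this feature space as the similarity measure. This naturally divides the original series into an ordered set of non-overlapping subsequences, which is then clustered to obtain a set of non-overlapping patterns. The method also defines similarity between time series of unequal length. Several well-known piecewise linear representation (PLR) methods are shown to be special cases of the proposed method, and theoretical analysis and experimental results are given.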
Similarity search and similarity-based knowledge discovery in large time series databases have recently attracted research interest. Both applications rest on a similarity measure and on a method for extracting patterns from time series data. A similarity measure and a systematic method for finding similar patterns in time series data, both based on polynomial regression, are proposed. A piecewise polynomial model is used to approximate the time series, mapping the original data to a feature space spanned by the coefficients of the polynomial model, and the Euclidean distance in that space is derived as the similarity measure. The time series is then divided into non-overlapping subsequences that can be clustered into a set of patterns. Similarity is also defined between time series of different lengths. Theoretical analysis and experimental results demonstrate that some well-known piecewise linear representation (PLR) methods are special cases of the proposed method.
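The mapping from subsequences to polynomial coefficients, and the Euclidean distance between coefficient vectors, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the fixed-length segmentation, the rescaling of each segment's time axis to [0, 1], and the names `fit_segment`, `segment_coefficients`, and `coefficient_distance` are all assumptions made for illustration, and ordinary least squares via `numpy.polyfit` stands in for whatever regression procedure the paper actually uses.

```python
import numpy as np


def fit_segment(segment, degree):
    """Least-squares polynomial fit of one subsequence.

    Assumption: the time axis of every segment is rescaled to [0, 1],
    so segments of different lengths yield comparable coefficients.
    Returns coefficients ordered from highest degree to lowest.
    """
    segment = np.asarray(segment, dtype=float)
    t = np.linspace(0.0, 1.0, len(segment))
    return np.polyfit(t, segment, degree)


def segment_coefficients(series, segment_length, degree):
    """Split a series into non-overlapping fixed-length segments
    (an illustrative simplification) and fit each one."""
    series = np.asarray(series, dtype=float)
    n_segments = len(series) // segment_length
    return np.array([
        fit_segment(series[k * segment_length:(k + 1) * segment_length], degree)
        for k in range(n_segments)
    ])


def coefficient_distance(a, b):
    """Euclidean distance between two coefficient vectors."""
    return float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))


# Two rising ramps followed by a falling ramp, fitted with straight lines.
series = [0, 1, 2, 3, 0, 1, 2, 3, 3, 2, 1, 0]
features = segment_coefficients(series, segment_length=4, degree=1)
same_shape = coefficient_distance(features[0], features[1])
opposite_shape = coefficient_distance(features[0], features[2])
```

With `degree=1` each segment reduces to a slope and an intercept, which is the sense in which a piecewise linear representation is a special case of the polynomial one. Under the rescaling assumption, a longer ramp such as `np.linspace(0, 3, 7)` maps to the same coefficients as the four-point ramp, giving one possible reading of similarity between subsequences of unequal length. The resulting coefficient vectors could then be passed to any standard clustering routine.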
Source
《西安交通大学学报》
EI
CAS
CSCD
Peking University Core Journal
2002, No. 12, pp. 1275-1278 (4 pages)
Journal of Xi'an Jiaotong University
Funding
Shaanxi Province Science and Technology Development Plan, "Tenth Five-Year" Key Research Project (2000K08-G12).