Abstract
This paper proposes QPLAS (quick piecewise linear approximation over time series streams), an adaptive segmentation algorithm for multiple data streams based on piecewise linear approximation (PLA) that segments the streams in real time. QPLAS works in a single scan with a sliding window. Its core idea is incremental computation, which allows the approximation error of each segment to be updated continuously in O(1) time. To segment multiple data streams simultaneously, the current unfinished segments of all streams are indexed in a B^+-tree, so QPLAS can process many streams efficiently while using only a small amount of memory. Experimental results show that QPLAS is 1 to 2 orders of magnitude faster than traditional segmentation methods.
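The abstract does not give the formulas behind the O(1) error update, so the following is only a minimal sketch of one standard way to obtain such an update, not the authors' implementation. It assumes the approximation error is the sum of squared residuals of a least-squares line, and it keeps running sums so that adding a point and re-evaluating the error both take constant time. The names `IncrementalLineFit`, `segment`, and `max_error` are hypothetical, and the B^+-tree index over multiple streams is not modeled here.

```python
class IncrementalLineFit:
    """Running sums for a least-squares line; each update is O(1)."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = self.syy = 0.0

    def add(self, x, y):
        # Fold one point into the sums without revisiting earlier points.
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y
        self.syy += y * y

    def sse(self):
        """Sum of squared residuals of the best-fit line, from the sums alone."""
        if self.n < 3:
            return 0.0  # one or two points are always fitted exactly
        cxx = self.sxx - self.sx * self.sx / self.n
        cxy = self.sxy - self.sx * self.sy / self.n
        cyy = self.syy - self.sy * self.sy / self.n
        if cxx == 0.0:
            return max(cyy, 0.0)
        # Clamp at zero to absorb floating-point round-off.
        return max(cyy - cxy * cxy / cxx, 0.0)


def segment(stream, max_error):
    """Yield (start, end) index pairs, end inclusive, in a single pass.

    Assumption: a segment is closed as soon as adding the next point
    would push its squared error above max_error.
    """
    fit = IncrementalLineFit()
    start = 0
    for i, y in enumerate(stream):
        # x is measured from the segment start to limit cancellation error.
        fit.add(float(i - start), float(y))
        if fit.sse() > max_error:
            yield (start, i - 1)
            start = i
            fit = IncrementalLineFit()
            fit.add(0.0, float(y))
    if fit.n > 0:
        yield (start, i)
```

For example, `list(segment([0, 1, 2, 3, 4, 10, 8, 6, 4], 0.5))` splits the series where the rising line turns into a falling one. Handling several streams would need one such fit object per stream; how QPLAS organizes them in its B^+-tree is not described in the abstract.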
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition)
EI
CAS
CSCD
PKU Core Journals
2009, Issue 5, pp. 64-67 (4 pages)
Funding
National High Technology Research and Development Program of China (2007AA01Z309, 2006AA01Z430)
Youth Foundation of Hengyang Normal University (07A31)
Keywords
data mining
pattern matching
piecewise linear technique
data streams
segmentation algorithms