Abstract
Existing incremental mining algorithms must re-mine the sequence database whenever the minimum support changes. To reduce the time and space cost of this repeated mining, this paper proposes an efficient incremental algorithm for mining sequential patterns. The algorithm uses a frequent sequence tree as its storage structure. When the sequence database and the minimum support change, it performs an update operation on the frequent sequence tree, then traverses the tree depth-first to find all sequential patterns in the sequence database. Experimental results show that the algorithm achieves higher mining efficiency than IncSpan and PrefixSpan, with lower time cost.
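The abstract does not describe the node layout or the update operation, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes that each sequence element is a single item, that the tree is built once with a low count threshold, and that later queries use a minimum support at or above that threshold. The names `Node`, `build_tree`, and `patterns` are hypothetical, and incremental database updates are not shown.

```python
from collections import defaultdict


class Node:
    """One tree node; the path from the root spells a sequence (assumed layout)."""

    def __init__(self, item=None, support=0):
        self.item = item
        self.support = support
        self.children = {}


def build_tree(db, tree_min_count):
    """Store every sequence whose support count is >= tree_min_count.

    db is a list of sequences; each sequence is a list of single items
    (a simplification: no itemsets inside an element).
    """
    root = Node(support=len(db))
    # A projection is a list of (sequence index, start position) pairs.
    _grow(root, db, [(i, 0) for i in range(len(db))], tree_min_count)
    return root


def _grow(node, db, projection, tree_min_count):
    # For each item, record its first occurrence in every projected suffix.
    first_hit = defaultdict(dict)
    for sid, start in projection:
        seq = db[sid]
        for pos in range(start, len(seq)):
            first_hit[seq[pos]].setdefault(sid, pos)
    for item, hits in sorted(first_hit.items()):
        if len(hits) >= tree_min_count:
            child = Node(item, len(hits))
            node.children[item] = child
            next_projection = [(sid, pos + 1) for sid, pos in hits.items()]
            _grow(child, db, next_projection, tree_min_count)


def patterns(root, min_count):
    """Depth-first traversal yielding (sequence, support) with support >= min_count."""
    stack = [(child, (child.item,)) for child in root.children.values()]
    while stack:
        node, prefix = stack.pop()
        if node.support < min_count:
            continue  # a longer sequence can never have higher support
        yield prefix, node.support
        for child in node.children.values():
            stack.append((child, prefix + (child.item,)))


db = [["a", "b", "c"], ["a", "c"], ["a", "b"], ["b", "c"]]
tree = build_tree(db, tree_min_count=2)
for min_count in (2, 3):
    # The same tree answers both queries; the database is not scanned again.
    print(min_count, sorted(patterns(tree, min_count)))
```

In this sketch, raising the minimum support only changes where the traversal prunes, which is why the database does not have to be mined again. Lowering it below `tree_min_count`, or adding sequences, would require the kind of tree update the abstract mentions.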
Source
《计算机工程》
CAS
CSCD
2012, Issue 12, pp. 39-41 (3 pages)
Computer Engineering
Funding
National Natural Science Foundation of China (61170190)
Qinhuangdao Science and Technology Research and Development Program (201001A018)
Keywords
data mining
incremental mining
sequential pattern
projected database
frequent sequence tree