Abstract
Constraints are essential in many sequential pattern mining applications. This paper presents a new constraint, the partial order constraint, which allows the interval between transactions to be infinite. In the duration constraint, by contrast, the interval between transactions can only be an integer, so the partial order constraint can be regarded as an extension of the duration constraint. To handle the partial order constraint, a novel algorithm called SPM (Sequential Pattern Maintenance) is proposed. SPM adopts an implicit segmentation technique to separate out the data sequences that do not satisfy the partial order constraint, and makes full use of the information obtained from previous mining to handle the complication that support values decrease as the number of data sequences shrinks. Experimental results show that SPM mines all frequent sequential patterns satisfying the constraint quickly and scalably.
Source
《计算机工程与科学》
CSCD
2007, No. 5, pp. 86-89 (4 pages)
Computer Engineering & Science
Funding
Supported by the Hebei Province Doctoral Fund (B2003226)
Keywords
data mining
constrained sequential pattern mining
partial-order constraint
implicit segmentation