Abstract
To address the explosive growth in the number of frequent subsequences as pattern length increases in sequential pattern mining, this paper proposes MFSPAN (maximal frequent sequential pattern mining algorithm), a new algorithm for mining the complete set of maximal frequent sequential patterns from sequence databases. MFSPAN exploits the property that different sequences may share a common prefix in order to reduce the number of itemset comparisons. Experimental results on standard test datasets demonstrate the effectiveness of MFSPAN.
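To make the notion of a maximal frequent sequential pattern concrete, the following is a minimal Python sketch. It is not MFSPAN, whose details the abstract does not give. It assumes sequences of single items rather than itemsets, and it uses a generic depth-first prefix-projection search followed by a maximality filter. The names `frequent_patterns`, `maximal_patterns`, and `is_subsequence` and the toy database are hypothetical.

```python
from typing import Dict, List, Tuple

Sequence = Tuple[str, ...]


def is_subsequence(small: Sequence, big: Sequence) -> bool:
    """True if `small` occurs in `big` in order (gaps allowed)."""
    it = iter(big)
    return all(item in it for item in small)


def frequent_patterns(db: List[Sequence], min_support: int) -> Dict[Sequence, int]:
    """Depth-first search over prefixes; NOT the paper's MFSPAN.

    Assumption: every element of a sequence is a single item.
    """
    found: Dict[Sequence, int] = {}

    def grow(prefix: Sequence, projected: List[Sequence]) -> None:
        # Count each item at most once per projected suffix.
        counts: Dict[str, int] = {}
        for suffix in projected:
            for item in set(suffix):
                counts[item] = counts.get(item, 0) + 1
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            found[pattern] = support
            # Keep only what follows the first occurrence of `item`.
            next_projected = [s[s.index(item) + 1:] for s in projected if item in s]
            grow(pattern, next_projected)

    grow((), db)
    return found


def maximal_patterns(found: Dict[Sequence, int]) -> List[Sequence]:
    """Frequent patterns that have no frequent proper super-sequence."""
    return sorted(
        p for p in found
        if not any(p != q and is_subsequence(p, q) for q in found)
    )


# Hypothetical toy database, for illustration only.
db = [("a", "b", "c"), ("a", "c"), ("a", "b", "c", "d"), ("b", "d")]
found = frequent_patterns(db, min_support=2)
print(len(found), maximal_patterns(found))
```

On this toy database with a minimum support of 2, the search finds nine frequent patterns, but only two of them, `('a', 'b', 'c')` and `('b', 'd')`, are maximal. This gap is the reduction in output size that motivates mining maximal patterns only.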
Source
《吉林大学学报(理学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2006, No. 4, pp. 570-574 (5 pages)
Journal of Jilin University:Science Edition
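Funding
National Natural Science Foundation of China (Grant No. 60433020)
Project fund of the Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education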
Keywords
sequential pattern
maximal sequential pattern
long pattern
depth-first