
New mining algorithm for frequent closed sequential patterns based on a subsume index (cited by: 1)
Abstract The set of frequent closed sequential patterns uniquely determines the complete set of frequent sequential patterns and is usually much smaller. Traditional closed sequential pattern mining algorithms extend a frequent sequence with every frequent single item, which often generates a large number of non-closed sequences. To address this problem, a new algorithm for mining frequent closed sequential patterns based on a subsume index is proposed. Its main idea is to extend a frequent sequence with closed itemsets only, which greatly reduces the generation of non-closed sequences. First, it is proved that a closed sequential pattern can consist only of closed itemsets. Second, it is shown how a subsume index can be used to discover closed itemsets quickly. Finally, a depth-first algorithm for mining frequent closed sequential patterns is presented. Experimental results show that the proposed algorithm is efficient.
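Source Systems Engineering and Electronics (《系统工程与电子技术》), indexed in EI, CSCD and the PKU Core list, 2009, No. 10, pp. 2485-2488 (4 pages)
Funding National Natural Science Foundation of China (60675030); Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality
Keywords data mining; frequent closed itemset; frequent closed sequential pattern; subsume index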
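
The abstract's central step, finding closed itemsets through a subsume index, can be illustrated with a small sketch. The abstract does not define the index, so the code below assumes the definition used in related itemset-mining work: the subsume index of an item i is the set of other items that occur in every transaction containing i. Under that assumption, the smallest closed itemset containing i is i together with its subsumed items. The function names and the toy data are hypothetical, and the sketch works on a flat list of transactions rather than on a sequence database, so it is not the paper's algorithm.

```python
from collections import defaultdict


def build_tidsets(transactions):
    """Map each item to the set of transaction ids that contain it."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def subsume_index(tidsets):
    """Assumed definition: subsume(i) holds every other item j whose
    transaction-id set is a superset of the transaction-id set of i."""
    return {
        i: {j for j, tids_j in tidsets.items() if j != i and tids_i <= tids_j}
        for i, tids_i in tidsets.items()
    }


def closure_of_item(item, index):
    """Smallest closed itemset containing `item`: the item plus its subsumed items."""
    return frozenset({item} | index[item])


# Hypothetical toy data: four transactions over the items a, b, c, d.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "d"}]
tidsets = build_tidsets(transactions)
index = subsume_index(tidsets)

for item in sorted(tidsets):
    print(item, sorted(closure_of_item(item, index)))
# a ['a', 'b']
# b ['b']
# c ['b', 'c']
# d ['a', 'b', 'd']
```

Here only b is closed as a single item; a, c and d each always occur together with b, so extending a sequence with {a} alone could never yield a closed pattern. That is the kind of extension the abstract says the proposed algorithm avoids.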

References (14)

  • 1. Dong G, Pei J. Sequence data mining[M]. New York: Springer, 2007.
  • 2. Han J, Cheng H, Xin D, et al. Frequent pattern mining: current status and future directions[J]. Data Mining and Knowledge Discovery, 2007, 15(1): 55-86.
  • 3. Agrawal R, Srikant R. Mining sequential patterns[C]//Proc. of the 11th International Conference on Data Engineering, 1995: 3-14.
  • 4. Pei J, Han J, Mortazavi-Asl B, et al. Mining sequential patterns by pattern growth: the PrefixSpan approach[J]. IEEE Trans. on Knowledge and Data Engineering, 2004, 16(11): 1424-1440.
  • 5. Zaki M J. SPADE: an efficient algorithm for mining frequent sequences[J]. Machine Learning, 2001, 42(1/2): 31-60.
  • 6. Yan X, Han J, Afshar R. CloSpan: mining closed sequential patterns in large databases[C]//Proc. of the 3rd SIAM International Conference on Data Mining, 2003: 166-177.
  • 7. Wang J, Han J, Li C. Frequent closed sequence mining without candidate maintenance[J]. IEEE Trans. on Knowledge and Data Engineering, 2007, 19(8): 1042-1056.
  • 8. Ye Feiyue. Distributed frequent pattern mining algorithm based on adaptive hash chains[J]. Systems Engineering and Electronics, 2005, 27(3): 560-564. (cited by: 2)
  • 9. Chen Huiping, Wang Jiandong, Ye Feiyue, Wang Yu. Maximal frequent itemset mining algorithm based on FP-tree and support array[J]. Systems Engineering and Electronics, 2005, 27(9): 1631-1635. (cited by: 2)
  • 10. Yang G. Computational aspects of mining maximal frequent patterns[J]. Theoretical Computer Science, 2006, 362(1-3): 63-85.

Secondary references (26)

  • 1. Ye Feiyue, Wang Jiandong, Zhuang Yi, Lü Zonglei. A new database partitioning method for mining frequent patterns[J]. Systems Engineering and Electronics, 2004, 26(11): 1666-1668. (cited by: 3)
  • 2. Qin Liangxi, Shi Zhongzhi. SFPMax: a maximal frequent pattern mining algorithm based on sorted FP-tree[J]. Journal of Computer Research and Development, 2005, 42(2): 217-223. (cited by: 26)
  • 3. Song Yuqing, Zhu Yuquan, Sun Zhihui, Yang Hebiao. An algorithm for mining and updating constrained maximal frequent itemsets based on a frequent pattern tree[J]. Journal of Computer Research and Development, 2005, 42(5): 777-783. (cited by: 21)
  • 4. Agrawal R, Srikant R. Fast algorithms for mining association rules[A]. VLDB[C], 1994. 487-499.
  • 5. Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[A]. SIGMOD[C], 2000. 1-12.
  • 6. Pei J, Han J, Lu H, et al. H-Mine: hyper-structure mining of frequent patterns in large databases[A]. Proc. Int. Conf. on Data Mining[C], 2001. 38.
  • 7. Park J S, Chen M S, Yu P S. Efficient parallel mining for association rules[A]. Proc. 4th Int. Conf. on Information and Knowledge Management[C]. Baltimore, Maryland, 1995. 31-36.
  • 8. Agrawal R, Shafer J C. Parallel mining of association rules: design, implementation, and experience[J]. IEEE Trans. Knowledge and Data Engineering, 1996. 962-969.
  • 9. Cheung David W, Han Jiawei, Ng Vincent T, et al. A fast distributed algorithm for mining association rules[A]. Proc. of 4th Int. Conf. on Parallel and Distributed Information Systems[C], Miami Beach, Florida, December, 1996. 31-43.
  • 10. Agrawal R, Imielinski T, Swami A N. Mining association rules between sets of items in large databases[A]. In P. Buneman and S. Jajodia, editors, Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data[C]. SIGMOD Record, ACM Press, 1993, 22(2): 207-216.

Co-citing literature (12)

Co-cited literature (19)

  • 1. HERNANDEZ-LEON R, PALANCAR J H, CARRASCO-OCHOA J A, et al. Algorithms for mining frequent itemsets in static and dynamic datasets[J]. Intelligent Data Analysis, 2010, 14(3): 419-435.
  • 2. HAN J, KAMBER M. Data mining: concepts and techniques[M]. 2nd ed. San Francisco, CA, USA: Morgan Kaufmann Publishers, 2006.
  • 3. PIATETSKY-SHAPIRO G. Data mining and knowledge discovery 1996 to 2005: overcoming the hype and moving from "university" to "business" and "analytics"[J]. Data Mining and Knowledge Discovery, 2007, 15(1): 99-105.
  • 4. CHIANG D A, WANG Y F, WANG Y H, et al. Mining disjunctive consequent association rules[J]. Applied Soft Computing, 2011, 11(2): 2129-2133.
  • 5. AGRAWAL R, IMIELINSKI T, SWAMI A. Mining association rules between sets of items in large databases[C]//Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data. Washington D C, USA: ACM Press, 1993. 207-216.
  • 6. AGRAWAL R, SRIKANT R. Fast algorithms for mining association rules in large databases[C]//Proceedings of the 20th International Conference on Very Large Data Bases. Santiago de Chile, Chile: Morgan Kaufmann Publishers, 1994: 487-499.
  • 7. SONG W, YANG B R, XU Z Y. Index-BitTableFI: an improved algorithm for mining frequent itemsets[J]. Knowledge-Based Systems, 2008, 21(6): 507-513.
  • 8. VREEKEN J, LEEUWEN M, SIEBES A. Krimp: mining itemsets that compress[J]. Data Mining and Knowledge Discovery, 2011, 23(1): 169-214.
  • 9. HAN J, PEI J, YIN Y, et al. Mining frequent patterns without candidate generation: a frequent-pattern tree approach[J]. Data Mining and Knowledge Discovery, 2004, 8(1): 53-87.
  • 10. LIU G, LU H, LOU W, et al. Efficient mining of frequent patterns using ascending frequency ordered prefix-tree[J]. Data Mining and Knowledge Discovery, 2004, 9(3): 249-274.

Citing literature (1)

Secondary citing literature (3)
