A new algorithm for mining frequent closed itemsets (article in English)

New algorithm of mining frequent closed itemsets


Abstract: To reduce the high time and storage cost of mining frequent closed itemsets, a mining algorithm based on the FC-tree (frequent closed pattern tree), called max-FCIA (maximal frequent closed itemsets algorithm), is proposed. The algorithm maps the transaction database into a hash table, obtains the support of all frequent itemsets by operating on the hash table, and then builds an ordered tree (a lexicographic subset tree) containing the frequent itemsets. Pruning this tree yields the FC-tree, which contains all minimum frequent closed itemsets. Finally, the frequent closed itemsets are generated from the minimum frequent closed itemsets. Experimental results show that mapping the transaction database reduces the time spent scanning the database and improves execution efficiency. In addition, the pruning strategy avoids generating unnecessary candidate itemsets, which saves storage space.
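The abstract gives no pseudocode, so the following is only a minimal Python sketch of the general idea it describes: index the transaction database in a hash table, derive supports from that index instead of rescanning the database, enumerate itemsets in lexicographic order with support-based pruning, and keep the closed ones. It is not the authors' max-FCIA, and it does not reproduce the FC-tree or its pruning rules. The function names `build_item_index` and `mine_frequent_closed` and the toy database are assumptions made for illustration.

```python
def build_item_index(transactions):
    """Map each item to the set of transaction ids that contain it.

    A Python dict is a hash table; it is filled in a single pass
    over the database.
    """
    index = {}
    for tid, items in enumerate(transactions):
        for item in items:
            index.setdefault(item, set()).add(tid)
    return index


def mine_frequent_closed(transactions, min_support):
    """Return {closed itemset (frozenset): support count}.

    An itemset is closed if no proper superset has the same support.
    Equivalently, it equals its closure: the set of items shared by
    every transaction that contains it.
    """
    index = build_item_index(transactions)
    # Frequent single items, in lexicographic order.
    items = sorted(i for i, tids in index.items() if len(tids) >= min_support)
    closed = {}

    def closure(tids):
        # Items present in every transaction of `tids`.
        return frozenset(i for i in items if tids <= index[i])

    def extend(tids, start):
        # Depth-first walk of the lexicographic subset tree.
        for k in range(start, len(items)):
            new_tids = tids & index[items[k]]
            if len(new_tids) < min_support:
                continue  # prune: no superset of this branch is frequent
            closed[closure(new_tids)] = len(new_tids)
            extend(new_tids, k + 1)

    extend(set(range(len(transactions))), 0)
    return closed


if __name__ == "__main__":
    # Hypothetical five-transaction database over items a..e.
    db = [set("acd"), set("bce"), set("abce"), set("be"), set("abce")]
    for itemset, count in mine_frequent_closed(db, min_support=2).items():
        print(sorted(itemset), count)
```

The sketch visits every frequent itemset, so it only illustrates the definitions. A practical algorithm, as the abstract indicates, prunes the tree further so that far fewer candidates are generated.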

Authors: Zhang Liang (张亮), Ren Yonggong (任永功), Fu Yu (付玉)

Affiliation: School of Computer and Information Technology, Liaoning Normal University

Source: Journal of Southeast University (English Edition), 2008, No. 3, pp. 335-338 (4 pages). Indexed in EI and CAS.

Funding: The National Natural Science Foundation of China (No. 60603047); the Natural Science Foundation of Liaoning Province; Liaoning Higher Education Research Foundation (No. 2008341)

Keywords: frequent itemsets; frequent closed itemsets; minimum frequent closed itemsets; maximal frequent closed itemsets; frequent closed pattern tree

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

