A pattern growth algorithm for frequent patterns mining (一种基于模式增长的频繁模式挖掘算法)

Cited by: 1

Abstract: A pattern-growth-based algorithm for frequent pattern mining, called PGMiner, is presented. The algorithm mines depth-first and generates no candidate itemsets, which makes it easier to discover long patterns and avoids the problems of the Apriori and FP-growth methods. A simple index structure is used to extend the pattern length step by step within a projected database. Because this index structure occupies little memory, the memory-based algorithm runs efficiently. PGMiner was tested against other algorithms on real-world datasets and on IBM artificial datasets. The experimental results show that, for general datasets and especially for relatively sparse ones that may contain long patterns, PGMiner performs better than Apriori and FP-growth.
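The abstract does not spell out PGMiner's index structure, so the following is only a minimal sketch of the general idea it describes: depth-first pattern growth over projected databases, with no candidate generation. It assumes each projection is held as a list of `(transaction id, offset)` pointers into the original data rather than as a copy. The function name `mine_frequent_patterns` and its parameters are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left
from collections import Counter


def mine_frequent_patterns(transactions, min_support):
    """Return {pattern tuple: support} using depth-first pattern growth.

    Illustrative sketch only; not the PGMiner implementation.
    """
    # Pass 1: count single items and keep only the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in counts.items() if c >= min_support}

    # Store each transaction once, as a sorted list of its frequent items.
    db = [sorted(i for i in set(t) if i in frequent) for t in transactions]
    results = {}

    def grow(prefix, projection):
        # A projection is a list of (tid, start) pointers; the projected
        # transaction is the suffix db[tid][start:], never a copy.
        local = Counter()
        for tid, start in projection:
            local.update(db[tid][start:])

        for item in sorted(local):
            support = local[item]
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Build the projection for the longer pattern by moving
            # each pointer just past the position of `item`.
            new_projection = []
            for tid, start in projection:
                row = db[tid]
                pos = bisect_left(row, item, start)
                if pos < len(row) and row[pos] == item:
                    new_projection.append((tid, pos + 1))
            grow(pattern, new_projection)

    grow((), [(tid, 0) for tid in range(len(db))])
    return results


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for pattern, support in sorted(mine_frequent_patterns(data, 3).items()):
        print(pattern, support)
```

Because every projection is a list of integer pairs, the memory cost per recursion level is proportional to the number of supporting transactions, which is one plausible reading of why an index-based projection suits sparse data.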

Authors: Hou Junjie (侯俊杰), Li Chunping (李春平)

Affiliation: School of Software, Tsinghua University

Source: Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报（自然科学版）》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2005, issue z1, pp. 272-274 (3 pages)

Keywords: frequent pattern; pattern growth; projected database; divide-and-conquer

Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

References (5)

[1] Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases[A]. Proc 1993 ACM-SIGMOD Int'l Conf on Management of Data (SIGMOD'93)[C]. Washington, 1993. 207-216
[2] Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[A]. Proc of the 2000 ACM SIGMOD International Conference on Management of Data[C]. ACM Press, 2000, Volume 29(2) of SIGMOD Record: 1-12
[3] Zheng Zijian, Ron Kohavi, Llew Mason. Real world performance of association rule algorithms[A]. Proc of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining[C]. ACM Press, 2001. 401-406
[4] http://fimi.cs.helsinki.fi/data/
[5] http://www.almaden.ibm.com/cs/quest/syndata/html

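Citing literature (1)

[1] Fang Xiangyan, Yu Yanting, Ding Yidong, Xiong Tinggang. A data stream classification algorithm based on time-window weights[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2011, 39(1): 41-44.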
