A pattern growth algorithm for frequent patterns mining (cited by: 1)
Abstract: This paper presents a frequent pattern mining algorithm based on pattern growth, called PGMiner. PGMiner mines depth-first and generates no candidate itemsets, which makes long patterns easier to discover and avoids the problems of the Apriori and FP-growth methods. It grows the pattern length step by step in a projected database through a simple indexing structure. Because this index occupies little memory, the memory-based algorithm runs efficiently. PGMiner was tested on real-world datasets and on IBM artificial datasets. The experimental results show that on general datasets, and especially on sparse datasets that may contain long patterns, PGMiner performs better than Apriori and FP-growth.
Source: Journal of Huazhong University of Science and Technology (Natural Science Edition); indexed in EI, CAS, CSCD, PKU Core; 2005, Issue z1, pp. 272-274 (3 pages)
Keywords: frequent pattern; pattern growth; projected database; divide-and-conquer
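
The abstract names the ingredients (depth-first growth, no candidate itemsets, an index over a projected database) but this record does not give PGMiner's actual data structures. The following is therefore only a minimal sketch of generic pattern growth under assumptions of our own: the function name `mine_frequent_patterns`, the `(row, start)` pointer list used as the index, and the sorted item order are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_left
from collections import Counter


def mine_frequent_patterns(transactions, min_support):
    """Depth-first pattern growth over pointer-based projected databases.

    Illustrative sketch only; this is not the published PGMiner algorithm.
    Returns a dict mapping each frequent pattern (a tuple) to its support.
    """
    # Sort and de-duplicate items so every transaction has one fixed order.
    db = [sorted(set(t)) for t in transactions]
    results = {}

    def grow(prefix, projection):
        # projection is the assumed index: a list of (row, start) pointers.
        # db[row][start:] holds the items that may still extend the prefix.
        counts = Counter()
        for row, start in projection:
            counts.update(db[row][start:])

        for item in sorted(counts):
            support = counts[item]
            if support < min_support:
                continue  # infrequent extensions are never grown further
            pattern = prefix + (item,)
            results[pattern] = support

            # Project by moving each pointer past `item`; no transaction
            # data is copied, only the small pointer list is rebuilt.
            child = []
            for row, start in projection:
                items = db[row]
                pos = bisect_left(items, item, start)
                if pos < len(items) and items[pos] == item:
                    child.append((row, pos + 1))
            grow(pattern, child)

    grow((), [(row, 0) for row in range(len(db))])
    return results
```

For example, calling `mine_frequent_patterns([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]], 2)` reports the three single items with support 3 and the three pairs with support 2, and never counts `("a", "b", "c")` as frequent because its support is 1. Only extensions that are already frequent in the current projection are explored, which is what distinguishes pattern growth from Apriori-style candidate generation.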

References (5)

  • [1] Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases[A]. Proc 1993 ACM-SIGMOD Int'l Conf on Management of Data (SIGMOD'93)[C]. Washington, 1993. 207-216.
  • [2] Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[A]. Proc of the 2000 ACM SIGMOD International Conference on Management of Data[C]. ACM Press, 2000, SIGMOD Record 29(2): 1-12.
  • [3] Zheng Z, Kohavi R, Mason L. Real world performance of association rule algorithms[A]. Proc of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining[C]. ACM Press, 2001. 401-406.
  • [4] http://fimi.cs.helsinki.fi/data/.
  • [5] http://www.almaden.ibm.com/cs/quest/syndata/html.
