Distributed algorithm for mining frequent patterns based on an adaptive hash chain structure (cited by: 2)
Abstract: A frequent pattern mining algorithm based on an adaptive hash chain structure is proposed for distributed systems. The algorithm first generates the local frequent 1-itemsets at each site and then the global frequent 1-itemsets. From the global frequent 1-itemsets, each site builds its projection database, scans the transactions in that projection database, and constructs a hash chain structure sized to the memory available at the site. The global frequent itemsets are obtained by mining the hash chain structures of all sites. The basic steps and the mining algorithm are given. The study shows that the algorithm is more efficient and more adaptable than existing approaches.
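The steps named in the abstract can be illustrated with a small Python sketch. The abstract does not specify the layout of the paper's hash chain structure or its mining procedure, so everything below is an assumption made for illustration: `HashChain` is an ordinary chained hash table that counts identical projected transactions, the hypothetical `buckets_per_site` parameter stands in for "sized to the available memory", and the final mining step is a brute-force subset count that is only practical for tiny inputs.

```python
from collections import Counter
from itertools import combinations


class HashChain:
    """Chained hash table: projected transaction -> occurrence count (assumed layout)."""

    def __init__(self, n_buckets):
        # Fewer buckets means less memory but longer chains.
        self.buckets = [[] for _ in range(max(1, n_buckets))]

    def add(self, key):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for entry in chain:
            if entry[0] == key:
                entry[1] += 1
                return
        chain.append([key, 1])

    def items(self):
        for chain in self.buckets:
            for key, count in chain:
                yield key, count


def global_frequent_items(sites, min_sup):
    """Local frequent 1-itemsets first, then global frequent 1-itemsets."""
    local_counts = [Counter(i for t in site for i in set(t)) for site in sites]
    candidates = set()
    for site, counts in zip(sites, local_counts):
        candidates |= {i for i, c in counts.items() if c >= min_sup * len(site)}
    total = sum(len(site) for site in sites)
    return {i for i in candidates
            if sum(c[i] for c in local_counts) >= min_sup * total}


def build_chain(site, frequent, n_buckets):
    """Project each transaction onto the global frequent items and hash it."""
    chain = HashChain(n_buckets)
    for t in site:
        projected = tuple(sorted(set(t) & frequent))
        if projected:
            chain.add(projected)
    return chain


def mine(sites, min_sup, buckets_per_site):
    """Return {itemset: global support count} for all global frequent itemsets."""
    frequent = global_frequent_items(sites, min_sup)
    support = Counter()
    for site, n_buckets in zip(sites, buckets_per_site):
        for key, count in build_chain(site, frequent, n_buckets).items():
            # Brute-force stand-in for the paper's mining step.
            for k in range(1, len(key) + 1):
                for subset in combinations(key, k):
                    support[subset] += count
    total = sum(len(site) for site in sites)
    return {s: c for s, c in support.items() if c >= min_sup * total}
```

For example, `mine([[["a", "b", "c"], ["a", "b"], ["a", "d"]], [["a", "b"], ["b", "c"], ["e"]]], 0.5, [4, 1])` treats the two inner lists as two sites with different bucket budgets. The bucket count changes only memory use and chain length, not the mined result.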
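Author: Ye Feiyue (叶飞跃)
Source: Systems Engineering and Electronics (《系统工程与电子技术》), indexed in EI, CSCD, and the Peking University Core Journals list; 2005, No. 3, pp. 560-564 (5 pages)
Funding: Supported by the Natural Science Research Program of Jiangsu Provincial Universities (04KJB46003)
Keywords: data mining; frequent pattern; distributed; adaptive; hash chain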
References (7)

  • 1. Agrawal R, Srikant R. Fast algorithms for mining association rules[A]. VLDB[C], 1994. 487-499.
  • 2. Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[A]. SIGMOD[C], 2000. 1-12.
  • 3. Pei J, Han J, Lu H, et al. H-Mine: hyper-structure mining of frequent patterns in large databases[A]. Proc. Int. Conf. on Data Mining[C], 2001. 38.
  • 4. Park J S, Chen M S, Yu P S. Efficient parallel mining for association rules[A]. Proc. 4th Int. Conf. on Information and Knowledge Management[C]. Baltimore, Maryland, 1995. 31-36.
  • 5. Agrawal R, Shafer J C. Parallel mining of association rules: design, implementation, and experience[J]. IEEE Trans. Knowledge and Data Engineering, 1996. 962-969.
  • 6. Cheung David W, Han Jiawei, Ng Vincent T, et al. A fast distributed algorithm for mining association rules[A]. Proc. of 4th Int. Conf. on Parallel and Distributed Information Systems[C], Miami Beach, Florida, December, 1996. 31-43.
  • 7. Ye Feiyue, Wang Jiandong, Chen Huiping, Zhang Youdong. Frequent pattern mining based on hash chain structure[J]. Computer Engineering and Applications (计算机工程与应用), 2004, 40(11): 174-176. Cited by: 4

