
Strategies of efficiency improvement for Eclat algorithm (提高Eclat算法效率的策略)

Cited by: 13
Abstract: To improve the efficiency of the Eclat algorithm, the algorithm was optimized in three respects: pruning, itemset joining, and intersection counting. First, itemsets that share the same suffix are grouped into one equivalence class so that pruning is more thorough, and a double layer hash table is used during pruning to speed up the search for subsets of candidate itemsets. Second, a partition list of the set of itemsets is proposed to reduce the comparisons needed when joining itemsets. Third, a transaction identifier (Tid) lost threshold is introduced to speed up intersection counting. Based on these three strategies, an optimized algorithm, Eclat_opt, is proposed. Comparative experiments against the original Eclat algorithm (ZAKI) and two other improved Eclat algorithms, Diffset (ZAKI) and hEclat (XIONG Zhong-yang), show that Eclat_opt is the most efficient of the four algorithms on sparse datasets and has the best overall time performance.
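Source: Journal of Zhejiang University: Engineering Science (《浙江大学学报(工学版)》), indexed in EI, CAS, CSCD, and the Peking University core journal list; 2013, No. 2, pp. 223-230 (8 pages)
Funding: National Natural Science Foundation of China (51175455); Natural Science Foundation of Zhejiang Province (Y1100257)
Keywords: Eclat algorithm; pruning; double layer hash table; partition list; intersection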
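The abstract names a Tid lost threshold for faster intersection counting but does not spell out its mechanics. The sketch below shows one plausible reading, stated here as an assumption rather than the paper's actual method: when two sorted Tid lists are intersected, a list of length n can afford to lose at most n - min_sup Tids before the joined itemset is certain to be infrequent, so the intersection is abandoned at that point. The function names `intersect_with_loss_budget` and `eclat` are hypothetical, and the surrounding miner uses the prefix-based equivalence classes of the original Eclat, not the suffix-based classes, double layer hash table, or partition list of Eclat_opt.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Itemset = Tuple[str, ...]


def intersect_with_loss_budget(
    a: List[int], b: List[int], min_sup: int
) -> Optional[List[int]]:
    """Intersect two sorted Tid lists; return None once the result cannot be frequent."""
    # Assumed reading of the "Tid lost threshold": each list may lose at most
    # len(list) - min_sup Tids before the intersection must fall below min_sup.
    budget_a = len(a) - min_sup
    budget_b = len(b) - min_sup
    if budget_a < 0 or budget_b < 0:
        return None
    out: List[int] = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1              # a[i] is absent from b: one Tid of a is lost
            budget_a -= 1
            if budget_a < 0:
                return None     # early exit: support is already too low
        else:
            j += 1              # b[j] is absent from a: one Tid of b is lost
            budget_b -= 1
            if budget_b < 0:
                return None
    return out if len(out) >= min_sup else None


def eclat(transactions: Sequence[Sequence[str]], min_sup: int) -> Dict[Itemset, int]:
    """Baseline vertical (Eclat-style) miner; returns itemset -> support count."""
    # Build the vertical layout: item -> sorted list of transaction ids.
    tidlists: Dict[str, List[int]] = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            tidlists.setdefault(item, []).append(tid)

    frequent: Dict[Itemset, int] = {}

    def extend(eq_class: List[Tuple[Itemset, List[int]]]) -> None:
        # All members of eq_class share the same prefix (original Eclat style).
        for idx, (itemset, tids) in enumerate(eq_class):
            frequent[itemset] = len(tids)
            next_class = []
            for other, other_tids in eq_class[idx + 1:]:
                joined = intersect_with_loss_budget(tids, other_tids, min_sup)
                if joined is not None:
                    next_class.append((itemset + (other[-1],), joined))
            if next_class:
                extend(next_class)

    roots = [((item,), tids) for item, tids in sorted(tidlists.items())
             if len(tids) >= min_sup]
    extend(roots)
    return frequent
```

With this reading, the early exit pays off mainly when most candidate joins are infrequent, since failing intersections are cut short well before either Tid list is exhausted.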

References (11)

  • 1. AGRAWAL R, SRIKANT R. Fast algorithms for mining association rules [C]// Proceedings of the 20th International Conference on Very Large Data Bases. Santiago, Chile: Morgan Kaufmann, 1994: 487-499.
  • 2. HAN J, PEI J, YIN Y. Mining frequent patterns without candidate generation [C]// Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. Dallas, United States: ACM, 2000: 1-12.
  • 3. FENG Pei-en, ZHANG Hui, QIU Qing-ying, et al. PCAR: an efficient approach for mining association rules [C]// Proceedings of the ICNC-FSKD 2008 International Conference on Fuzzy Systems and Knowledge Discovery. Jinan: IEEE, 2008: 605-609.
  • 4. ZAKI M J. Scalable algorithms for association mining [J]. IEEE Transactions on Knowledge and Data Engineering, 2000, 12(3): 372-390.
  • 5. SONG Chang-xin, MA Ke (宋长新, 马克). Research on an improved Eclat data mining algorithm [J]. Microcomputer Information (微计算机信息), 2008, 24(24): 92-94. Cited by: 17
  • 6. ZAKI M J. Fast vertical mining using diffsets [R]. Technical Report 01-1. Troy, New York: Rensselaer Polytechnic Institute, 2001.
  • 7. XIONG Zhong-yang, CHEN Pei-en, ZHANG Yu-fang (熊忠阳, 陈培恩, 张玉芳). Improved Eclat algorithm for association rules based on a hash Boolean matrix [J]. Application Research of Computers (计算机应用研究), 2010, 27(4): 1323-1325. Cited by: 18
  • 8. LI Min, LI Chun-ping (李敏, 李春平). Analysis and comparison of frequent pattern mining algorithms [J]. Journal of Computer Applications (计算机应用), 2005, 25(B12): 166-171. Cited by: 11
  • 9. HAN J, KAMBER M. Data mining: concepts and techniques [M]. San Francisco, United States: Morgan Kaufmann Publishers Inc, 2001: 231.
  • 10. LIU Jing-lian (刘井莲). Comparative analysis of the Eclat and Eclat+ algorithms [J]. Journal of Suihua University (绥化学院学报), 2010, 30(2): 189-190. Cited by: 1


