
An Algorithm for Mining Global Closed Frequent Itemsets Based on Distributed Frequent Concept Direct Product

Cited by: 18
Abstract: Centralized data mining algorithms based on concept lattices cannot fully exploit distributed computing resources to speed up lattice construction, which limits the performance of the mining algorithms. This paper further analyzes the inherent parallelism of the apposition assembly of Iceberg concept lattices. Taking each frequent concept direct product together with its lower cover as the minimal unit of computation, it decomposes the apposition assembly process so that the units can be scattered, processed on distributed nodes, and finally integrated into a global Iceberg concept lattice. The correctness of this distributed assembly procedure is proved theoretically. On this basis, a new algorithm is proposed for mining global closed frequent itemsets in heterogeneous distributed computing environments. The algorithm exploits the semi-lattice structure of the Iceberg concept lattice and its suitability for apposition assembly, and so makes full use of the computing resources available in a distributed environment. Experiments show that the algorithm performs well on both dense and sparse heterogeneous distributed data sets.
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list. 2012, No. 5, pp. 990-1001 (12 pages).
Keywords: Iceberg concept lattice; distributed data mining; apposition assembly; heterogeneous databases; closed frequent itemsets
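
The idea behind apposition assembly can be illustrated with a small brute-force sketch. It is not the algorithm from the paper: it only shows, on a toy data set invented for this example, that when two sites hold the same transactions but disjoint attribute sets, the global closed frequent itemsets can be recovered by intersecting the extents of pairs of local frequent concepts and keeping the pairs that stay above the support threshold. The function names `closed_frequent` and `appose`, the two site contexts, and the threshold `MINSUP` are all assumptions made for illustration.

```python
from itertools import combinations


def closed_frequent(context, attributes, minsup):
    """Brute-force Iceberg concepts: map each closed frequent itemset
    (intent) to the set of objects that contain it (extent)."""
    result = {}
    for r in range(len(attributes) + 1):
        for combo in combinations(attributes, r):
            extent = frozenset(g for g in context if set(combo) <= context[g])
            if len(extent) < minsup:
                continue
            # Closure: every attribute shared by all objects in the extent.
            intent = frozenset(
                m for m in attributes if all(m in context[g] for g in extent)
            )
            result[intent] = extent
    return result


def appose(local1, local2, minsup):
    """Combine local results from two sites that share the same objects
    but hold disjoint attribute sets (hypothetical helper)."""
    by_extent = {}
    for intent1, extent1 in local1.items():
        for intent2, extent2 in local2.items():
            extent = extent1 & extent2
            if len(extent) >= minsup:
                # Several pairs may yield the same extent; the closed
                # intent is the union of all their local intents.
                merged = by_extent.get(extent, frozenset()) | intent1 | intent2
                by_extent[extent] = merged
    return {intent: extent for extent, intent in by_extent.items()}


# Toy data (invented): five transactions split by attribute across two sites.
SITE1 = {1: {"a", "b"}, 2: {"a"}, 3: {"a", "b"}, 4: {"b"}, 5: {"a", "b"}}
SITE2 = {1: {"c"}, 2: {"c", "d"}, 3: {"c"}, 4: {"d"}, 5: {"c", "d"}}
MINSUP = 2  # absolute support threshold, assumed for the example

local1 = closed_frequent(SITE1, ["a", "b"], MINSUP)
local2 = closed_frequent(SITE2, ["c", "d"], MINSUP)
assembled = appose(local1, local2, MINSUP)

# Reference result computed centrally on the joined data.
GLOBAL = {g: SITE1[g] | SITE2[g] for g in SITE1}
central = closed_frequent(GLOBAL, ["a", "b", "c", "d"], MINSUP)

for intent, extent in sorted(assembled.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(intent), "support =", len(extent))
```

Because a frequent global extent is always the intersection of two frequent local extents, each pair can be checked independently of the others, which is what makes the assembly step a natural candidate for distribution. The sketch enumerates all pairs on one machine and is exponential in the number of attributes, so it is suitable only for toy inputs.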

