Parallel Frequent Pattern Discovery: Challenges and Methodology

Abstract: Parallel frequent pattern discovery algorithms exploit parallel and distributed computing resources to relieve the sequential bottlenecks of current frequent pattern mining (FPM) algorithms. As a result, parallel FPM algorithms achieve better scalability and performance and are attracting much attention in the data mining research community. This paper presents a comprehensive survey of state-of-the-art parallel and distributed frequent pattern mining algorithms, with emphasis on pattern discovery from complex data (e.g., sequences and graphs) on various platforms. A review of typical parallel FPM algorithms uncovers the major challenges, methodologies, and research problems in the field, such as workload balancing, finding good data layouts, and data decomposition. The survey also indicates a dramatic shift in research interest from simple parallel frequent itemset mining on traditional parallel and distributed platforms to parallel pattern mining of more complex data on emerging architectures, such as multi-core systems and the increasingly mature grid infrastructure.
Source: Tsinghua Science and Technology (SCIE, EI, CAS), 2007, No. 6, pp. 719-728, 10 pages in total. Journal of Tsinghua University (Natural Science Edition, English Edition)
Funding: Supported by the Basic Research Foundation of Tsinghua National Laboratory for Information Science and Technology (TNList)
Keywords: frequent pattern mining; parallel computing; dynamic load balancing
