
Parallel Hierarchical Association Rule Mining in Big Data Environment (cited by: 27)
Abstract: To meet the demand for real-time processing of big data, this paper proposes a partition-based parallel hierarchical association rule mining algorithm (Parallel Hierarchical Association Rule Mining, PHARM). First, the whole database D is randomly divided into n non-overlapping partitions, and the local frequent itemsets of all partitions are mined in parallel. Next, using the Apriori property, the local frequent itemsets from all partitions are combined to form the global candidate itemsets with respect to D. D is then scanned once more to count the actual support of each candidate and so determine the global frequent itemsets. Finally, the efficiency of the algorithm is analyzed by modeling.
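The steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`local_frequent_itemsets`, `pharm_sketch`), the relative support threshold, the seeded shuffle, and the thread pool standing in for real parallel workers are all assumptions made for illustration, and the "hierarchical" aspect of PHARM is not modeled because the abstract does not describe it.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def local_frequent_itemsets(partition, min_support):
    """Plain Apriori on one partition (hypothetical helper).

    min_support is a fraction of the partition size.
    """
    min_count = min_support * len(partition)
    # Level 1: count single items.
    counts = {}
    for t in partition:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s for s, c in counts.items() if c >= min_count}
    frequent = set(level)
    k = 2
    while level:
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        # Keep candidates that reach the local support threshold.
        level = {
            c for c in candidates
            if sum(1 for t in partition if c <= t) >= min_count
        }
        frequent |= level
        k += 1
    return frequent


def pharm_sketch(transactions, min_support, n_partitions, seed=0):
    """Partition-based mining: local mining, candidate union, one global scan."""
    transactions = [frozenset(t) for t in transactions]
    shuffled = transactions[:]
    random.Random(seed).shuffle(shuffled)
    # Random, non-overlapping partitions of D (empty ones are dropped).
    partitions = [shuffled[i::n_partitions] for i in range(n_partitions)]
    partitions = [p for p in partitions if p]
    # Mine each partition concurrently; threads are only a stand-in here.
    with ThreadPoolExecutor() as pool:
        local_results = pool.map(
            lambda p: local_frequent_itemsets(p, min_support), partitions
        )
        # Global candidates: union of all local frequent itemsets.
        candidates = set().union(*local_results)
    # Second scan of D: count the actual support of each candidate.
    min_count = min_support * len(transactions)
    support = {c: sum(1 for t in transactions if c <= t) for c in candidates}
    return {c: n for c, n in support.items() if n >= min_count}
```

The second scan is what makes the result exact: an itemset that is frequent in D must be frequent in at least one partition under the same relative threshold, so the union of local results cannot miss a globally frequent itemset, and the recount discards the false positives.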
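Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 1, pp. 286-289 (4 pages)
Funding: National Natural Science Foundation of China (61163010); Natural Science Foundation of Gansu Province (1308RJZA194)
Keywords: Big data; Partition; Association rule; Parallel hierarchical mining; High efficiency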
