期刊文献+

大规模图挖掘算法并行化研究

Survey on large scale graph mining parallelization
下载PDF
导出
摘要 目前大规模图挖掘算法的思路是基于MapReduce将矩阵与向量相乘的过程并行化,但却没有针对MapReduce特点对图数据进行划分,会产生大量中间结果,算法代价较高。针对这些问题,提出了GIM-V LI算法。该算法采用数据划分思想,将图矩阵横向划分,结合MapReduce特点以行为单位替代点或块的数据组织方式,并设计出<key,value>结构,使一个单位数据仅产生一个中间结果,从而大大减少了中间结果,提高了算法的性能。通过大量实验分析验证了该改进算法的正确性与有效性。 The design of large scale graph mining depends on the parallelization of matrix-vector multiplication based on MapReduce.But it will produce a large number of intermediate results due to the lack of data division.To reduce the cost of the algorithm,the GIM-V LI algorithm is proposed.The data partitioning ideas are adopted,the input graph data based on line is divided according to characteristic of MapReduce,and the structure of key,value pair is designed to reduce the intermediate results by one unit data producing only an intermediate result.Extensive experiments verify the correctness and effectiveness of the algorithm.
出处 《计算机工程与设计》 CSCD 北大核心 2012年第9期3465-3469,3474,共6页 Computer Engineering and Design
基金 国家自然科学基金项目(60803043 60873196 61033007) 国家863高技术研究发展计划基金项目(2009AA01A404)
关键词 大规模图挖掘 矩阵与向量相乘 数据划分 MAPREDUCE GIM-VLI large scale graph mining matrix-vector multiplication data partition MapReduce GIM-V LI
  • 相关文献

参考文献12

  • 1Kang U, Tsourakakis C, Appel A, et al. Hadi: Fast diameter estimation and mining in massive graphs with hadoop [D]. Pittsburgh: Carnegie Mellon University, 2008.
  • 2Chen J. Detecting communities in social networks using max- min modularity [J]. SDM, 2009, 3 (1): 20-24.
  • 3Falkowski T, Barth A. Dengraph: A density based community de- tection algorithm [J]. Web Intelligence, 2007, 1 (3): 10-14.
  • 4KE Y, CHENG J. Top-k correlative graph mining [J]. SDM, 2009, 2 (4): 150-163.
  • 5CHENG J, YU J. Fast graph pattern matching [C]. Cancun: ICDE, 2008: 913- 922.
  • 6Tsourakakis C E, Kang U. Doulion: Counting triangles in mas- sive graphs with a coin [C]. Paris: ACM, 2009. 407-416.
  • 7Qian T, Srivastava J. Simultaneouly finding fundamental arti- cles and new topics using a community tracking method [C]. USA: Springer, 2009: 796-803.
  • 8Jeddrey D, Sanjar G. MapReduee. Simplified data processing onlarge clusters [C]. New York: ACM, 2008: 137-150.
  • 9Jonathan C. Graph twiddling in a MapReduce world [J]. Com- puting in Science & Engineering, 2009, 2 (15): 29-41.
  • 10Jimmy L, Michael S. Design pattern for efficiet graph algo- rithrns in MapReduce [C]. New York: ACM, 2010: 78-85.

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部