Abstract
Current large-scale graph mining algorithms parallelize matrix-vector multiplication on MapReduce, but they do not partition the graph data to suit the characteristics of MapReduce. As a result, they generate a large number of intermediate results and incur a high computational cost. To address these problems, the GIM-V LI algorithm is proposed. It adopts a data-partitioning approach: the graph matrix is partitioned horizontally, the data are organized by row instead of by point or block to match the characteristics of MapReduce, and a <key, value> structure is designed so that each unit of data produces only one intermediate result. This greatly reduces the volume of intermediate results and improves the algorithm's performance. Extensive experiments verify the correctness and effectiveness of the improved algorithm.
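The effect of organizing data by row can be illustrated with a small in-memory sketch. This is not the paper's implementation: the function names (`map_by_element`, `map_by_row`, `reduce_sum`, `group_rows`), the toy graph, and the assumption that the whole vector is available to every mapper are all hypothetical and chosen only for illustration. It simulates one map and reduce pass of a matrix-vector product and compares how many intermediate pairs each layout emits.

```python
from collections import defaultdict


def map_by_element(edges, vector):
    """Element-wise layout: emit one intermediate pair per nonzero entry."""
    for row, col, weight in edges:
        yield row, weight * vector[col]


def map_by_row(rows, vector):
    """Row-wise layout: emit one intermediate pair per matrix row."""
    for row, entries in rows.items():
        yield row, sum(weight * vector[col] for col, weight in entries)


def reduce_sum(pairs):
    """Sum the values that share a key, as a reducer would."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def group_rows(edges):
    """Group (row, col, weight) triples into {row: [(col, weight), ...]}."""
    rows = defaultdict(list)
    for row, col, weight in edges:
        rows[row].append((col, weight))
    return dict(rows)


# Toy adjacency matrix as (row, col, weight) triples; purely illustrative.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0), (2, 0, 1.0), (2, 1, 1.0)]
vector = [1.0, 2.0, 3.0]

element_pairs = list(map_by_element(edges, vector))
row_pairs = list(map_by_row(group_rows(edges), vector))

print(len(element_pairs), "intermediate pairs with the element-wise layout")
print(len(row_pairs), "intermediate pairs with the row-wise layout")
print(reduce_sum(element_pairs) == reduce_sum(row_pairs))
```

Both layouts reduce to the same product vector; the row-wise mapper simply emits one pair per row rather than one per nonzero entry, which is the kind of saving the abstract attributes to the row-based organization.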
Source
Computer Engineering and Design (《计算机工程与设计》)
Indexed in: CSCD, Peking University Core Journals (北大核心)
2012, No. 9, pp. 3465-3469, 3474 (6 pages)
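Funding
National Natural Science Foundation of China (60803043, 60873196, 61033007)
National High Technology Research and Development Program of China (863 Program) (2009AA01A404)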