Abstract
As social networks have grown, their user bases have expanded exponentially. The resulting large-scale data contains a great deal of valuable information, and mining it has become a major research focus; friend recommendation is one important application of data mining. To obtain better performance and higher scalability, both academia and industry are increasingly turning to distributed platforms for large-scale friend recommendation. The most widely used approach today is friend recommendation based on the MapReduce framework, which scales well but suffers in performance because MapReduce transfers intermediate data inefficiently. To address this problem, this paper proposes a friend recommendation algorithm based on a distributed graph computing framework. The method is evaluated on several real social network datasets. The experimental results show that it outperforms the advanced friend recommendation algorithms used in industry: at comparable accuracy, its performance is about seven times that of the other algorithms.
Authors
Zhao Masha (赵马沙)
Zhou Wei (周薇)
Zhang Hao (张豪)
Han Jizhong (韩冀中)
Affiliations: Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Source
Computer Applications and Software (《计算机应用与软件》)
Indexed in: CSCD
2016, No. 6, pp. 32-36 (5 pages)
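Funding
National Natural Science Foundation of China (60903047)
National High Technology Research and Development Program of China (2012AA01A401, 2013AA013204)
Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030200)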
Keywords
Friend recommendation
Distributed graph computing framework
Random walk