Abstract
The SimRank algorithm proposed by Jeh and Widom is a general-purpose model for computing structural similarity. Because SimRank computes the similarity between graph nodes iteratively, its time and space complexity are both very high. As data volumes grow rapidly, the computing power of a single machine can no longer meet the demands of large-scale data. This paper proposes a distributed SimRank algorithm based on the MapReduce computing model, uses it to measure similarity on RDF graphs, and then applies a distributed AP clustering algorithm to cluster the graph nodes. Experimental results show that the method measures graph-node similarity efficiently and clusters the graph effectively.
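To make the cost of the iterative computation concrete, the following is a minimal single-machine sketch of the basic SimRank recurrence. It is not the distributed MapReduce version that the paper proposes. The function name `simrank`, the decay factor of 0.8, the iteration count, and the toy graph are all illustrative assumptions and are not taken from the paper.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=5):
    """Naive SimRank over a graph given as {node: [in-neighbors]}.

    Hypothetical helper for illustration only; decay and iterations
    are assumed values, not parameters reported in the paper.
    """
    nodes = list(in_neighbors)
    # Initial scores: each node is fully similar to itself, 0 otherwise.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}

    for _ in range(iterations):
        new_sim = {}
        # Every iteration visits all node pairs, which is the source of
        # the high time and space cost on a single machine.
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # A node without in-neighbors is similar only to itself.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors,
            # scaled by the decay factor.
            total = sum(sim[(x, y)] for x in ia for y in ib)
            new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Toy graph (assumed): node "u" points to both "a" and "b".
scores = simrank({"u": [], "a": ["u"], "b": ["u"]})
print(scores[("a", "b")])
```

The score table holds one entry per node pair, and each update sums over pairs of in-neighbors, so both memory and running time grow quickly with graph size. This is the pressure that motivates distributing the computation.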
Source
Electronic Design Engineering (《电子设计工程》)
2015, Issue 6, pp. 9-11, 15 (4 pages in total)
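Funding
Natural Science Foundation of Liaoning Province (2013020014)
Planned Research Project of the China Higher Vocational and Technical Education Research Association (GZYGH1213036, GZYGH1213035)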