期刊文献+

信息网络中一个有效的基于链接的结点相似度度量 被引量:3

Effective Link-Based Measure of Node Similarity on Information Networks
下载PDF
导出
摘要 信息网络无处不在.通过把网络中的对象抽象为点,把对象之间的关系刻画为边,相应的信息网络就可以用图来表示.图中结点相似度计算是图数据管理中的基本问题,在很多领域都有运用,比如社会网络分析、信息检索和推荐系统等.其中,著名的相似度度量是以Personalized Page Rank和Sim Rank为代表.这两种度量本质都是以图中的路径来定义,然而它们侧重的路径截然不同.为此,提出了一个度量Super Sim Rank.它不仅涵盖了这些路径,而且考虑了Personalized Page Rank和Sim Rank两者都没有考虑的路径,从而能够更加体现出这种链接关系的本质.在此基础上对Super Sim Rank进行了理论分析,从而提出了相应的优化算法,使得计算性能从最坏情况O(kn4)提高到O(knl).这里,k是迭代次数,n是结点数,l是边数.最后,通过实验验证了Super Sim Rank优于Sim Rank和Personalized Page Rank,同时验证了优化算法在各种情况下都是有效的. Information networks are ubiquitous. These networks can be modeled by graphs where nodes refer to objects and edges are relationships between objects in the networks. Measuring nodes similarity in a graph is a fundamental problem of graph data management. There are many real-world applications based on similarity between objects, such as networks analyses, information retrieval and recommendation systems. A number of link-based similarity measures have been proposed. Both SimRank and personalized PageRank are the most popular and influential similarity measures. The two measures differ from each other because they depend on different paths. This paper proposes a similarity measure that takes advantages of both SimRank and personalized PageRank. The new measure strengthens the principle of SimRank: "Two objects are similar if they are referenced by similar objects". The paper also analyzes the new similarity measure in theory and proposes an optimization algorithm which reduces the time cost from O(kn^4) to O(knl) where k is the number of iteration, n is the number of nodes and l is the number of edges. Experimental results demonstrate the effectiveness of the new similarity measure and efficiency of the optimization algorithm.
出处 《软件学报》 EI CSCD 北大核心 2014年第11期2602-2615,共14页 Journal of Software
基金 国家重点基础研究发展计划(973)(2014CB340402 2012CB316205) 国家自然科学基金(61272137 61033010 61202114) 国家社会科学基金(12&ZD220) 国家高技术研究发展计划(863)(2014AA015200) 国家高等学校学科创新引智计划
关键词 随机游走:相似度度量 SIMRANK PERSONALIZED PAGERANK random walk similarity measure SimRank personalized PageRank
  • 相关文献

参考文献33

  • 1Liben-Nowell D, Kleinberg J. The link prediction problem for social networks. Journal of the American Society for Information Science and Technology, 2007,58(7): 1019-1031. [doi: 10.1002/asi.20591].
  • 2Antonellis I, Molina HG, Chang CC. Simrank++: Query rewriting through link analysis of the click graph. Proc. of the VLDB Endowment, 2008,1(1):408-421. [doi: 10.14778/1453856.1453903].
  • 3Jin R, Lee VE, Hong H. Axiomatic ranking of network role similarity. In: Proc. of the 17th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining (KDD 2011). New York: ACM Press, 2011. 922-930. [doi: 10.1145/2020408.2020561].
  • 4Gy6ngyi Z, Garcia-Molina H, Pedersen J. Combating Web spam with trustrank. In: Proe. of the 30st Int'l Conf. on Very Large Data Bases (VLDB 2004). New York: ACM Press, 2004. 576-587.
  • 5Gupta P, Goel A, Lin J, Sharma A, Wang D, Zadeh R. WTF: The who to follow service at Twitter. In: Proc. of the 22Nd Int'l Conf. on World Wide Web (WWW 2013). New York: ACM Press, 2013. 505-514.
  • 6Fujiwara Y, Nakatsuji M, Onizuka M, Kitsuregawa M. Fast and exact top-k search for random walk with restart. Proc. of the VLDB Endowment, 2012,5(5):442-453. [doi: 10.14778/2140436.2140441].
  • 7Jeh G, Widom J. Scaling personalized Web search. In: Proc. of the 12th Int'l Conf. on World Wide Web (WWW 2003). New York: ACM Press, 2003. 271-279. [doi: 10.1145/775152.775191].
  • 8Jeh G, Widom J. SimRank: A measure of structural-context similarity. In: Proc. of the 8th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining (KDD 2002). New York: ACM Press, 2002. 536-543. Idol: 10.1145/775047.775126].
  • 9Sarkar P, Moore AW, Prakash A. Fast incremental proximity search in large graphs. In: Proc. of the 25th Internet Conf. on Machine Learning (ICML 2008). New York: ACM Press, 2008. 896-903. [doi: 10.1145/1390156.1390269].
  • 10Fujiwara Y, Nakatsuji M, Yamamuro M, Shiokawa H, Onizuka M. Efficient personalized PageRank with accuracy assurance. In: Proc. of the 18th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining (KDD 2012). New York: ACM Press, 2012. 15-23. Idol: 10.1145/2339530.233953g].

同被引文献63

  • 1方滨兴,郭云川,周渊.互联网信息内容安全的ICCON控制模型及评价[J].中国科学(F辑:信息科学),2009,39(9):951-965. 被引量:10
  • 2胡海波,王林.幂律分布研究简史[J].物理,2005,34(12):889-896. 被引量:87
  • 3谭跃进,吴俊,邓宏钟,朱大智.复杂网络抗毁性研究综述[J].系统工程,2006,24(10):1-5. 被引量:63
  • 4杜海峰,李树茁,W.F.Marcus,悦中山,杨绪松.小世界网络与无标度网络的社区结构研究[J].物理学报,2007,56(12):6886-6893. 被引量:72
  • 5Ma Shuai,Li Jia,Liu Xudong,et al.Graph query:new search in social computing era[J].Communications of China Computer Federation,2012,8(11):26-33.
  • 6Cui Wanyun,Xiao Yanghua,Wang Haixun,et al.Local search of communities in large graphs[C]//Proceedings of the2014 ACM SIGMOD International Conference on Management of Data,Snowbird,USA,Jun 22-27,2014.New York,NY,USA:ACM,2014:991-1002.
  • 7Gopalan P K,Blei D M.Efficient discovery of overlapping communities in massive networks[J].Proceedings of the National Academy of Sciences,2013,110(36):14534-14539.
  • 8Akiba T,Iwata Y,Yoshida Y.Dynamic and historical shortestpath distance queries on large evolving networks by pruned landmark labeling[C]//Proceedings of the 23rd International World Wide Web Conference,Seoul,Korea,Apr 7-11,2014.New York,NY,USA:ACM,2014:237-248.
  • 9Wei Hao,Xu Yu J,Lu Can,et al.Reachability querying:an independent permutation labeling approach[J].Proceedings of the VLDB Endowment,2014,7(12):1191-1202.
  • 10Zhu A D,Lin Wenqing,Wang Sibo,et al.Reachability queries on large dynamic graphs:a total order approach[C]//Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data,Snowbird,USA,Jun 22-27,2014.New York,NY,USA:ACM,2014:1323-1334.

引证文献3

二级引证文献13

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部