摘要
随着网络论坛、博客、微博的发展,引出社会网络中的用户排序问题。将在线网络论坛中用户映射为节点,用户评论过程中形成的回复关系映射为有向关联图,其节点度符合幂律分布。且论坛中用户的主题发布行为和回复关系符合Pagerank算法的互增强和随机游走特性,因此选用Pagerank算法排序用户影响力。该文提出的研究问题:如何提高用户排序应用中数据的存储和运行效率。天涯网络论坛中80%以上用户入度为0,据此,根据入度是否为0划分为两个集合,对入度为0集合按出度构造链接表,设计了基于集合划分的高效排序算法SD-Rank。SD-Rank时空复杂性为O(V′),V′为入度非0节点集。对天涯网络论坛真实用户数据的实验结果表明:SD-Rank算法时空复杂性优于Pagerank算法。
With the wide application of BBS,blog and micro blog etc,how to rank the user becomes a well-recognized research issue,especially in the social network area.The paper analyzes the users relational graph in network BBS,representing the correlation graph by users and replies between users,and reveals its power law distribution in in/out degree.Owing the fact that the user’s behavior of posting and replying is in accordance with Pagerank’s characteristics of random walking and mutual-enforcement,the user’s influence is ranked with Pagerank algorithm.This paper further addresses the issue of time-space ratio in computation resulted by the exponent growth of user’s number.Based on the observation that over 80 percent user’s indegree is 0.Using list structure designs the efficient set-division-ranking arithmetic(SD-Rank) is designed by via list structure after dividing users into two sets: 0-indegree in set0 and non-0-indegree in set1.Through set partitions according to degree distribution,the time-space complexity of SD-Rank is decreased from O(V+E) to O(V′),in which V′ is the size of set1.Experiment on TIANYA BBS dataset shows that SD-Rank is more efficient than Pagerank.
出处
《中文信息学报》
CSCD
北大核心
2012年第4期122-128,共7页
Journal of Chinese Information Processing
基金
国家863自然科学基金(2010AA012504)
国家973重点基础研究发展规划项目基金(G2011CB302605)
国家自然科学基金(61173145)
关键词
幂律
入度
集合划分
快速排序
power law
in degree
set division
quick rank