Abstract
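This paper summarizes several challenging research topics for online social networks, introduces a logical model that describes user relationships (the Follow Model), and proposes the Relationship Committed Adjacency Matrix (Follow Matrix). Using this model, it presents the Aggregation-Ranking-Delete algorithm for Top-X information queries on microblog platforms. Applying the concepts of map and reduce, it then extends the Top-X query algorithm to a parallel computing environment and gives the Map Followee and Reduce Follower algorithms as implemented on a Hadoop cluster. The Follow Model and the corresponding algorithms were applied to 74.7 GB of real Sina Weibo data and 101 GB of real Twitter data for information queries under multiple constraints and for retweet prediction. In the Hadoop cluster environment in particular, the new method markedly improves information reduction and computing performance.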
Based on a review of the related literature, this paper identifies several meaningful but challenging research topics in online social networks (OSNs). It introduces a logical model of user relationships, the Follow Model, and proposes the Relationship Committed Adjacency Matrix (Follow Matrix) to represent the basic relationships between users. To demonstrate the model, the Aggregation-Ranking-Delete algorithm is presented for ranking the Top-X in OSNs. Using MapReduce concepts, the paper then extends this query to parallel computation, which leads to the Map Followee and Reduce Follower algorithms implemented on Hadoop. The Follow Model and the related algorithms are applied to data collected from Sina Weibo (74.7 GB) and Twitter (101 GB) for multi-constraint querying and retweet prediction. The results show that the parallel solution on Hadoop substantially reduces the amount of information stored and improves computing performance.
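The abstract names the Follow Matrix, the Aggregation-Ranking-Delete algorithm, and the Map Followee and Reduce Follower steps, but does not spell them out. The Python sketch below is one plausible reading, not the authors' implementation. It assumes that "aggregation" means summing the columns of a 0/1 follow matrix into follower counts, that "ranking" sorts by that count, and that "delete" drops everything past the first X entries. The edge list and all function names are hypothetical, and the map and reduce steps run in plain Python rather than on Hadoop.

```python
from collections import Counter

# Hypothetical (follower, followee) edges; not data from the paper.
EDGES = [
    ("ann", "bob"), ("cat", "bob"), ("dan", "bob"),
    ("ann", "cat"), ("bob", "cat"),
    ("bob", "ann"),
]


def follow_matrix(edges):
    """Return (users, M) where M[i][j] == 1 if users[i] follows users[j]."""
    users = sorted({user for edge in edges for user in edge})
    index = {user: i for i, user in enumerate(users)}
    matrix = [[0] * len(users) for _ in users]
    for follower, followee in edges:
        matrix[index[follower]][index[followee]] = 1
    return users, matrix


def top_x_from_matrix(users, matrix, x):
    """Assumed aggregate-rank-delete reading of a Top-X follower query."""
    # Aggregate: each column sum is the follower count of that user.
    counts = [sum(row[j] for row in matrix) for j in range(len(users))]
    # Rank: highest count first, ties broken by user name.
    ranked = sorted(zip(users, counts), key=lambda pair: (-pair[1], pair[0]))
    # Delete: discard everything after the first x entries.
    return ranked[:x]


def map_followee(edge):
    """Map step (assumed): emit (followee, 1) for each follow edge."""
    _follower, followee = edge
    return followee, 1


def reduce_follower(pairs):
    """Reduce step (assumed): sum the emitted ones per followee."""
    totals = Counter()
    for followee, one in pairs:
        totals[followee] += one
    return dict(totals)


def top_x_map_reduce(edges, x):
    """Same Top-X query expressed as a map step followed by a reduce step."""
    totals = reduce_follower(map(map_followee, edges))
    return sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))[:x]


if __name__ == "__main__":
    users, matrix = follow_matrix(EDGES)
    print(top_x_from_matrix(users, matrix, 2))
    print(top_x_map_reduce(EDGES, 2))
```

Both routes give the same ranking on this toy input. The map and reduce version never builds the dense matrix, which is the kind of storage saving the abstract attributes to the parallel approach, though the paper's actual mechanism may differ.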
Source
《复杂系统与复杂性科学》
EI
CSCD
PKU Core (Peking University Core Journals)
2013, No. 2, pp. 77-87 (11 pages)
Complex Systems and Complexity Science
Funding
National Council for Scientific and Technological Development of Brazil (CNPq; grants 304058/2010-6 and 478039/2012-3)
Keywords
complex networks
parallel algorithms
microblog
information query
map and reduce
online social networks
complex system
parallel computing
Micro-blog
information query
MapReduce
online social networks