Abstract
Microblogging is an interactive online social networking service. Owing to the continual expansion of its dissemination mechanisms and reach, it has become a highly convenient way for internet users to exchange information. Using Sina Weibo as the data source, this paper analyzes, for users who share the same microblog tag, the correlation between the content of their posts and their user tags, and derives a content-based user-ranking method. Highly active service teams on the same topic are introduced, word segmentation and text classification are performed, and the Naive Bayes algorithm is combined with probability statistics to determine how relevant the posts of tagged users are to the assigned tag. This verifies the feasibility of the ranking method and optimizes the ranking of tagged users, further enriching current microblog services.
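The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, under stated assumptions: posts are already segmented into tokens, a multinomial Naive Bayes classifier with Laplace smoothing decides whether a post matches a tag, and users are ranked by the share of their posts classified as matching. The function names, the toy data, and the ranking score are all hypothetical illustrations, not the authors' method.

```python
import math
from collections import Counter, defaultdict

def train_nb(labeled_posts):
    """Collect per-class document and word counts from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled_posts:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(model, tokens):
    """Return the most probable label for a tokenized post (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokens:
            if tok in vocab:  # tokens unseen in training are ignored
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def rank_users(model, posts_by_user, tag):
    """Rank users by the fraction of their posts classified as `tag` (assumed score)."""
    scores = {
        user: sum(predict(model, post) == tag for post in posts) / len(posts)
        for user, posts in posts_by_user.items()
        if posts
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: pre-segmented posts labeled with a tag or "other".
training = [
    ("travel hotel flight".split(), "travel"),
    ("beach trip hotel".split(), "travel"),
    ("stock market fund".split(), "other"),
    ("football match goal".split(), "other"),
]
posts_by_user = {
    "user_a": ["hotel trip".split(), "flight beach".split()],
    "user_b": ["stock fund".split(), "hotel flight".split()],
}

model = train_nb(training)
ranking = rank_users(model, posts_by_user, "travel")
print(ranking)
```

With real Weibo text, a Chinese word segmenter would need to run before `train_nb`, and the relevance score could just as well be a posterior probability rather than a hard classification; both choices are left open here.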
Source
Journal of Intelligence (《情报杂志》)
CSSCI
PKU Core Journal (北大核心)
2014, Issue 3, pp. 122-127, 149 (7 pages)
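Funding
Ministry of Education Humanities and Social Sciences Planning Project, "Microblog-Based Crisis Event Groups…" (No. 11YJA870017)
University of International Business and Economics Graduate Research Innovation Fund, "Research on Service Discovery Based on Microblog Data" (No. A2012029)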
Keywords
Microblog
user-ranking
content analysis
Naive Bayes
text classification