Research on Topic Popularity Prediction Techniques for Sina Weibo (cited by: 7)

Predicting Popularity of Tweets on Sina Weibo
Abstract: Microblogging, a new form of online social network, has become an important platform for obtaining and sharing information. This paper studies the problem of predicting the popularity of topics (tweets) on Sina Weibo, which was two years old at the time of writing and is the largest microblogging site in China. About 40G of microblog topic data was collected as the research dataset, and the user attributes and content attributes related to topic popularity were extracted from it. Based on a correlation analysis of these attributes, the paper proposes a quantitative description of topic popularity that accounts for both user and content attributes. Principal component analysis (PCA) is then applied to the attributes that affect popularity; it reduces the feature dimensions and yields four attributes as the basis for prediction. From these, a PCA-based linear model is built to predict the popularity of a newly submitted tweet in a timely and accurate way. Validation on the Sina Weibo network shows that the model predicts popularity well, with the evaluation index R^2 reaching 0.89.
Source: Journal of Information Engineering University (《信息工程大学学报》), 2012, No. 4, pp. 496-502 (7 pages)
Funding: State Key Laboratory Open Project (SKLSDE-2011KF-06)
Keywords: microblogging; topic (tweet) popularity; prediction; principal component analysis
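
The abstract describes a PCA step followed by a linear prediction model evaluated with R^2. The sketch below illustrates that general pipeline as principal component regression on synthetic data. It is not the paper's implementation: the feature matrix, the choice of keeping four principal components (rather than selecting four original attributes), the helper names `fit_pca_linear`, `predict`, `r_squared` and `make_synthetic`, and the data itself are all assumptions made for illustration.

```python
import numpy as np


def fit_pca_linear(X, y, n_components=4):
    """Standardize X, project onto the top principal components, fit OLS."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    Z = (X - mean) / std
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    scores = Z @ components.T
    # Least-squares fit of y on [1, scores]; coef[0] is the intercept.
    A = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"mean": mean, "std": std, "components": components, "coef": coef}


def predict(model, X):
    """Apply the stored standardization, projection and linear coefficients."""
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["components"].T
    return model["coef"][0] + scores @ model["coef"][1:]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def make_synthetic(n=500, seed=0):
    """Hypothetical data: 8 correlated features driven by 4 latent factors."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, 4))
    mixing = rng.normal(size=(4, 8))
    X = latent @ mixing + 0.05 * rng.normal(size=(n, 8))
    y = latent @ np.array([1.5, -0.8, 0.6, 0.3]) + 0.1 * rng.normal(size=n)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    model = fit_pca_linear(X[:400], y[:400])       # train on the first 400 rows
    held_out = r_squared(y[400:], predict(model, X[400:]))
    print(f"held-out R^2 on synthetic data: {held_out:.3f}")
```

The R^2 printed here describes only the synthetic data and says nothing about the 0.89 reported in the abstract; reproducing that figure would require the paper's dataset and its definition of popularity.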