Microblog Popularity Prediction Algorithm Based on XGBoost (基于XGBoost的微博流行度预测算法)

Cited by: 3
Abstract: With the arrival of the all-media era and the growth of social networks, popularity prediction has come to play an important role in public opinion monitoring and in the competition for data discourse power. Existing popularity prediction research focuses mostly on foreign-language media; predicting popularity on mainstream domestic (Chinese) media, represented by microblog (Weibo), is an emerging and challenging direction. This paper studies microblog as a domestic social media platform. By analyzing the features of microblog content and microblog users, it designs several popularity prediction schemes and proposes a microblog popularity prediction algorithm based on XGBoost. The algorithm converts the popularity prediction problem into a classification problem over interaction-value tiers and, within this classification framework, trains the model on the extracted and fused features, so that the popularity of microblog posts with user information can be predicted fairly accurately. The algorithm is validated on a microblog popularity prediction dataset, where it reaches an accuracy of 85.69%.
Authors: REN Minjie (任敏捷); JIN Guoqing (靳国庆); WANG Xiaowen (王晓雯); CHEN Ruidong (陈睿东); YUAN Yunxin (袁运新); NIE Weizhi (聂为之); LIU An'an (刘安安). Affiliations: State Key Laboratory of Communication Content Cognition, People's Daily Online, Beijing 100733, China; School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China.
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 2, pp. 383-395 (13 pages).
Funding: Open Fund of the State Key Laboratory of Communication Content Cognition (20K04).
Keywords: social media prediction; XGBoost; feature extraction; feature fusion; microblog popularity
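
The classification framing described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not define the interaction value, the tier boundaries, or the fusion scheme, so the sum of reposts, comments, and likes, the cut points in `TIER_EDGES`, and fusion by concatenation are all assumptions, and every function name here is hypothetical. The sketch covers only data preparation (tier labels and fused feature vectors) and the accuracy metric, using the Python standard library.

```python
from bisect import bisect_right

# ASSUMPTION: hypothetical tier boundaries; the paper's actual cut points
# are not stated in the abstract.
TIER_EDGES = [10, 100, 1000]


def interaction_value(reposts, comments, likes):
    """ASSUMPTION: interaction value taken as reposts + comments + likes."""
    return reposts + comments + likes


def to_tier(value, edges=TIER_EDGES):
    """Map a raw interaction value to a tier index in 0..len(edges).

    A value equal to a boundary falls into the higher tier.
    """
    return bisect_right(edges, value)


def fuse_features(content_features, user_features):
    """ASSUMPTION: fuse content and user features by simple concatenation."""
    return list(content_features) + list(user_features)


def accuracy(predicted, actual):
    """Fraction of posts whose predicted tier matches the true tier."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty list")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


# Toy posts with made-up feature values, for illustration only.
posts = [
    {"content": [0.2, 1.0], "user": [1500.0, 0.0],
     "reposts": 3, "comments": 1, "likes": 4},
    {"content": [0.7, 0.0], "user": [82000.0, 1.0],
     "reposts": 40, "comments": 25, "likes": 310},
]

# Fused feature matrix X and tier labels y, ready for a multi-class classifier.
X = [fuse_features(p["content"], p["user"]) for p in posts]
y = [to_tier(interaction_value(p["reposts"], p["comments"], p["likes"]))
     for p in posts]
```

Under these assumptions, `X` and `y` would be passed to a multi-class classifier (for example `xgboost.XGBClassifier`), and the predicted tiers would be scored with `accuracy`. Framing the task as tier classification rather than regression on raw counts presumably makes the target less sensitive to the heavy-tailed distribution of interaction counts, though the abstract does not state the authors' motivation.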
