
Predicting Popularity Based on Multi-Task Learning on Twitter (基于多任务学习的微博流行度预测)
Abstract: Micro-blogs, characterized by short-text posts, have become an important medium for information dissemination, and predicting micro-blog popularity matters for public opinion monitoring, corporate marketing, and hot-content recommendation. Existing work on popularity prediction mainly builds a single unified model over the data of all users, and few studies consider the differences among users with different levels of influence. Our analysis of micro-blog data shows that the effect of tweet characteristics (such as the presence of hashtags and mentions, and tweet length) on popularity varies markedly with the influence of the poster, so a predictive model that accounts for these differences should achieve higher accuracy. To this end, we introduce Multi-Task Learning (MTL) into popularity prediction and combine it with SVM to build an SVM+MTL model. Specifically, we divide users into groups by influence level and treat prediction for each group as a task. The model learns the commonality and the differences among the tasks simultaneously, improving predictive performance by considering both the properties shared by all users and the characteristics specific to users at each influence level. In addition to commonly used features such as user attributes and posting behavior, we introduce a new feature, micro-blog content similarity, computed from a post's similar posts: the top k most similar posts by the same author, selected by Word Mover's Distance (WMD). This feature clearly improves prediction accuracy. Experiments on a large Twitter dataset show that, compared with Naive Bayes, Support Vector Machine (SVM), logistic regression, and the J48 decision tree, the SVM+MTL model effectively improves predictive performance.
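The abstract does not spell out the SVM+MTL formulation, so the following is only a minimal sketch of one common way to combine a linear SVM with multi-task learning: the feature-augmentation construction from regularized multi-task learning, in which each task's weight vector is a shared part plus a task-specific part. Everything here is an assumption made for illustration: the function names (`augment`, `train_linear_svm`, `accuracy`), the Pegasos-style trainer, the parameter `mu`, and the four-point toy dataset, in which a single feature (say, hashtag present or absent, encoded as +1/-1) has opposite effects in a low-influence group (task 0) and a high-influence group (task 1). It is not the authors' implementation or data.

```python
import random


def augment(x, task, n_tasks, mu=1.0):
    """Map x to [shared block | one block per task].

    A linear model on this vector behaves like w_task = w_shared + v_task.
    Larger mu penalizes the shared part more, making tasks more independent.
    """
    d = len(x)
    shared = [xi / mu ** 0.5 for xi in x]
    specific = [0.0] * (d * n_tasks)
    specific[task * d:(task + 1) * d] = list(x)
    return shared + specific


def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Linear SVM (hinge loss, L2 penalty, no bias) via Pegasos-style SGD."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (L2 penalty)
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def accuracy(w, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    hits = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi))
        hits += (1 if score > 0 else -1) == yi
    return hits / len(y)


# Invented toy data: (features, influence group, label in {+1, -1}).
samples = [([1.0], 0, 1), ([-1.0], 0, -1), ([1.0], 1, -1), ([-1.0], 1, 1)]
y = [label for _, _, label in samples]

X_pooled = [x for x, _, _ in samples]  # one unified model for all users
X_mtl = [augment(x, task, n_tasks=2) for x, task, _ in samples]

pooled_acc = accuracy(train_linear_svm(X_pooled, y), X_pooled, y)
mtl_acc = accuracy(train_linear_svm(X_mtl, y), X_mtl, y)
print("pooled:", pooled_acc, "multi-task:", mtl_acc)
```

On this deliberately contradictory toy set, a single pooled weight cannot fit both groups, while the augmented model can assign each group its own correction on top of the shared weights. Real influence groups would presumably differ far less sharply, which is where the shared block becomes useful.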
Authors: HAN Fengjuan (韩凤娟), XIAO Chunjing (肖春静), WANG Huan (王欢) (School of Computer and Information Engineering, Henan University, Kaifeng, Henan 475004, China)
Source: Journal of Henan University: Natural Science (《河南大学学报(自然科学版)》), CAS, 2017, No. 5, pp. 544-551 (8 pages)
Funding: National Natural Science Foundation of China (61402151); Henan Province Science and Technology Research Program (162102410010)
Keywords: micro-blog; Twitter; popularity; prediction; multi-task learning
