基于概率生成模型的微博话题传播群体划分方法

Group Partition in Topic-related Microblogging Spreading Based on Probability Generation Model
Abstract Events spread rapidly through microblogs in the form of topics and can have enormous influence. Analyzing the users who take part in topic propagation, and discovering groups with different topical interests and sentiment orientations, has therefore drawn wide attention from governments and enterprises. Most group discovery algorithms currently applied to microblogs start from individual users and consider only their social ties, in isolation from the content they share, so the resulting groups carry no semantic information. A few algorithms combine social ties with content but ignore the structural characteristics of microblogs themselves, including the behavior and sentiment information they contain, and are therefore poorly suited to mining groups in microblog topic discussions. Starting from the microblog topic, we jointly consider user interaction during topic propagation, microblog text content, sentiment polarity, and user behavior information, and propose BP-STG, a group partition method for microblog topic propagation based on a probabilistic generative model. Inference uses Gibbs sampling. The model uncovers not only groups with different topical tendencies but also each group's sentiment distribution and each user's activeness and behavior within the group. The model can also be extended to other media with social-network characteristics, such as e-mails and forum posts. Experiments on two topic datasets collected from Sina Weibo show that BP-STG partitions topic propagation groups effectively, identifies active users within each group and users' behavior patterns in the group, and provides more meaningful semantic information than the state-of-the-art model.
Authors CHEN Jing (陈静), LIU Yan (刘琰), WANG Xu-zhong (王煦中) (State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China)
Source Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 8, pp. 223-228, 239 (7 pages)
Funding Supported by the National Natural Science Foundation of China (61309007) and the National 863 Program (2012AA012902)
Keywords Microblogging topic, Probability generation model, Group partition, Sentiment information, Behavior pattern