Abstract
Detecting hot topics on micro-blogs plays an important role in public opinion analysis and opinion mining. To reduce the impact of data sparsity on topic detection, this paper proposes a micro-blog hot topic detection method based on heat co-ranking and builds a unified model framework that integrates the various relationships between micro-blog texts and topic keywords. The method takes into account both the authority of micro-blog users and the time-related characteristics of topic keywords, so that the heat of micro-blog texts and the heat of topic keywords are ranked jointly and reinforce each other. Topic keywords in the resulting heat sequence are then clustered, using the support of keyword combinations as a threshold, to represent hot topics. Experimental results show that the proposed method achieves high accuracy in both hot keyword extraction and hot topic detection, and that it can promptly and effectively discover micro-blog hot topics within a specific time period.
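The abstract does not give the update equations, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a HITS-style mutual reinforcement in which a post's heat is its author's authority times the heat of its keywords, a keyword's heat is a time-window weight times the heat of the posts containing it, and top-ranked keywords are grouped by single-link pairwise support. The function names, the weighting scheme, the sample data, and the threshold value are all hypothetical.

```python
def co_rank(posts, authority, time_weight, iterations=20):
    """Assumed HITS-style mutual reinforcement of post heat and keyword heat.

    posts: list of keyword sets; authority: one weight per post (assumed);
    time_weight: keyword -> weight for the target time window (assumed).
    """
    keywords = sorted(set().union(*posts))
    kw_heat = {k: 1.0 / len(keywords) for k in keywords}
    post_heat = [1.0 / len(posts)] * len(posts)
    for _ in range(iterations):
        # A post is hot if an authoritative user wrote it and it holds hot keywords.
        post_heat = [authority[i] * sum(kw_heat[k] for k in p)
                     for i, p in enumerate(posts)]
        total = sum(post_heat) or 1.0
        post_heat = [h / total for h in post_heat]
        # A keyword is hot if it is time-relevant and appears in hot posts.
        kw_heat = {k: time_weight.get(k, 1.0)
                   * sum(h for h, p in zip(post_heat, posts) if k in p)
                   for k in keywords}
        total = sum(kw_heat.values()) or 1.0
        kw_heat = {k: h / total for k, h in kw_heat.items()}
    return post_heat, kw_heat


def support(posts, a, b):
    """Fraction of posts that contain both keywords."""
    return sum(1 for p in posts if a in p and b in p) / len(posts)


def cluster_keywords(posts, kw_heat, top_n, min_support):
    """Walk the heat ranking; a keyword joins the first cluster in which it
    reaches min_support with at least one member, else it starts a new one."""
    ranked = sorted(kw_heat, key=lambda k: (-kw_heat[k], k))[:top_n]
    clusters = []
    for kw in ranked:
        for cluster in clusters:
            if any(support(posts, kw, other) >= min_support for other in cluster):
                cluster.append(kw)
                break
        else:
            clusters.append([kw])
    return clusters


# Hypothetical toy data: six posts with per-post author authority.
posts = [
    {"earthquake", "rescue"},
    {"earthquake", "rescue", "donation"},
    {"earthquake", "donation"},
    {"concert", "tickets"},
    {"concert", "tickets"},
    {"lunch"},
]
authority = [2.0, 1.5, 1.0, 1.0, 1.0, 0.5]

post_heat, kw_heat = co_rank(posts, authority, time_weight={})
clusters = cluster_keywords(posts, kw_heat, top_n=5, min_support=0.3)
print(clusters)
```

Each resulting cluster stands for one candidate hot topic. Single-link grouping is chosen here only for brevity; a stricter rule that requires the threshold against every cluster member would produce smaller, tighter topics.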
Source
Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》)
CSCD
Peking University Core Journal (北大核心)
2016, Issue 4, pp. 573-581 (9 pages)
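Funding
National Natural Science Foundation of China, No. 61373148
National Social Science Foundation of China, No. 12BXW040
Shandong Province Outstanding Young and Middle-aged Scientist Award Fund, No. BS2013DX033
Natural Science Foundation of Shandong Province, No. ZR2012FM038
Humanities and Social Sciences Foundation of the Ministry of Education of China, No. 14YJC860042
Shandong Province Social Science Planning Project, No. 12BXWJ01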