Abstract
Topic evolution research helps track user preferences and the development trends of topics, and it is important for early warning of public opinion. Current topic evolution methods focus on applying topic generation models to topic evolution analysis, but they ignore the time factor and the presence of background words in topics. Taking the traditional topic generation model LDA as a basis, this paper extends it into a microblog topic generation model, MTLDA. By accounting for background words, the MTLDA model improves the efficiency of topic generation. The microblog topic set is also divided into time slices, and the KL distance (KL divergence) between topics in adjacent time slices is computed to analyze how topics evolve. Experiments on Sina Weibo data show that the MTLDA model generates microblog topics through time-slice division and that the topic evolution results are consistent with the actual situation.
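The abstract describes measuring topic evolution by computing the KL distance between topics in adjacent time slices. The following is a minimal Python sketch of that one step, not the authors' implementation. It assumes each topic is represented as a word-probability distribution over a shared vocabulary, one distribution per time slice. The function names `kl_divergence` and `adjacent_slice_distances`, the smoothing constant `eps`, and the toy distributions are all hypothetical. The abstract does not say whether a symmetrized variant of KL is used, so the plain one-directional form is shown here.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions over the same vocabulary.

    eps is an illustrative floor that avoids division by zero when q has a
    zero entry; it is an assumption, not something taken from the paper.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


def adjacent_slice_distances(slices):
    """Return the KL distance between each pair of adjacent time slices.

    slices: list of word distributions for one topic, ordered by time.
    """
    return [
        kl_divergence(slices[t], slices[t + 1])
        for t in range(len(slices) - 1)
    ]


# Toy data (hypothetical): one topic over three time slices, vocabulary of 3 words.
topic_over_time = [
    [0.5, 0.3, 0.2],
    [0.5, 0.3, 0.2],  # unchanged from the previous slice
    [0.1, 0.2, 0.7],  # the topic's word distribution has shifted
]
distances = adjacent_slice_distances(topic_over_time)
```

In this sketch a distance of zero between adjacent slices means the topic's word distribution did not change, while a larger value indicates a stronger shift in the topic's content.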
Authors
王振飞 (WANG Zhen-fei)
刘凯莉 (LIU Kai-li)
郑志蕴 (ZHENG Zhi-yun)
王飞 (WANG Fei)
School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China
Source
《计算机科学》 (Computer Science)
CSCD
Peking University Core Journal (北大核心)
2017, No. 8, pp. 270-273, 279 (5 pages)
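Funding
Interim result of a bidding project of the New Media Public Communication discipline, Zhengzhou University (XMTGGCBJSZ11)
Supported by the Science and Technology Research Project of Henan Province (142102310531)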
Keywords
Microblog
Topic evolution
Social network
MTLDA model
Kullback-Leibler (KL) divergence