Abstract
This paper proposes a topic-model-based method for detecting microblog communities. The method uses a Dirichlet process to adaptively generate multiple latent geographical regions, and a multinomial distribution to describe how topic intensity evolves across time bins. Users' preferences in choosing latent geographical regions and communities are incorporated into the topic model. Finally, the EM algorithm and Gibbs sampling are used to estimate the parameters of the spatio-temporal topic model, so that communities can be detected on the basis of topic similarity. Experimental results show that the method identifies microblog communities more accurately.
Source
《电子科技大学学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2014, No. 3, pp. 464-469 (6 pages)
Journal of University of Electronic Science and Technology of China
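Funding
National High Technology Research and Development Program of China (863 Program) (2013AA12A203)
National Natural Science Foundation of China (41361022)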
Keywords
Dirichlet process
geo-tagged microblog
microblog community detection
microblog topic mining
spatio-temporal topic model