Abstract
With the growth of social media, many online networks generate large volumes of content. Discovering the latent structure of these networks helps in understanding their function and supports deeper analysis and prediction. Communities and topics are the two main bases for discovering network structure: community detection models the network's links, and topic identification models its content. However, sparse links yield communities that are hard to interpret, and irrelevant content yields inaccurate topics. Probabilistic models that combine content and links have become the mainstream approach to this problem. This paper classifies them by objective into topic identification models, topic community detection models, and community-topic detection models; analyzes the design background, basic principles, and solution methods of representative models; examines their open problems through qualitative comparison and experimental analysis; and finally outlines possible future research directions for combined models.
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journals (北大核心)
2013, Issue 11, pp. 2524-2528 (5 pages)
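Funding
Supported by the Fundamental Research Funds for the Central Universities (2012YJS027)
Supported by the Beijing Natural Science Foundation (4112046)
Supported by a project of the Hebei Provincial Department of Science and Technology (11213584)
Supported by the Natural Science Foundation of Hebei Province (F2008000204)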
Keywords
text-augmented network
topic model
community detection
community-topic distribution