A Comprehensive Clustering Method of Social Tags Based on LDA (cited by: 14)
Abstract: Social tagging systems produce large numbers of ambiguous, uncontrolled tags, which degrade the user experience and limit how efficiently resources can be used. Tag clustering groups tags with similar semantics and reveals the latent semantic structure of tags, which helps alleviate these problems. Traditional tag clustering methods usually rely only on the tags assigned to resources; because they ignore users' tagging information, their clusters do not capture accurate semantics. This paper proposes a comprehensive clustering method for social tags based on the LDA (Latent Dirichlet Allocation) model. The method builds two topic learning models, one from users' tagging information and one from resources' tag annotations, and learns the user-based and resource-based latent topics of tags. It then combines each tag's probability distributions over these two kinds of topics to build a second-stage (re-learning) tag topic model, which learns the mixed topics of tags by iterative learning. Finally, each tag is assigned to a cluster according to the topic on which it has the maximum probability. Compared with traditional methods, the proposed method makes effective use of the semantic relations among tags and, to some extent, mitigates the high-dimensionality and sparsity problems that traditional tag clustering methods face. Experimental results show that the proposed method performs well.
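Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), CSSCI, Peking University Core Journal, 2015, No. 2, pp. 146-155 (10 pages)
Funding: Supported by the National Natural Science Foundation of China (61273292, 61303131, 51374114, 51474007) and by the Ministry of Education Humanities and Social Sciences Research Youth Fund project "Research on Methods for Discovering Hierarchical Relations among Tags in Social Tagging Environments" (13YJCZH077)
Keywords: social tagging system, tag clustering, latent semantics, topic model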
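The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the second-stage model is built or which inference procedure is used, so the following are assumptions made for illustration: a small collapsed Gibbs sampler for LDA, a pseudo-document encoding in which each tag becomes a bag of first-stage topic ids weighted by its topic probabilities, and all names and parameters (`lda_gibbs`, `topic_given_word`, `cluster_tags`, `scale`, the toy `triples`).

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics, n_words, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not optimized).
    docs is a list of lists of word ids. Returns (doc-topic counts, topic-word counts)."""
    rng = random.Random(seed)
    ndk = [[0] * n_topics for _ in docs]
    nkw = [[0] * n_words for _ in range(n_topics)]
    nk = [0] * n_topics
    z = []
    for d, doc in enumerate(docs):          # random initial topic assignments
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)
            zd.append(k); ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                 # remove the current assignment
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + n_words * beta)
                           for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights)[0]
                z[d][i] = k                 # record the resampled assignment
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return ndk, nkw

def topic_given_word(nkw, beta=0.01):
    """Estimate p(topic | word) for every word from smoothed topic-word counts."""
    n_topics, n_words = len(nkw), len(nkw[0])
    out = []
    for w in range(n_words):
        col = [nkw[k][w] + beta for k in range(n_topics)]
        total = sum(col)
        out.append([c / total for c in col])
    return out

def cluster_tags(triples, k_user=2, k_res=2, n_clusters=2, scale=20, seed=0):
    """Two-stage clustering of tags from (user, resource, tag) annotations."""
    tags = sorted({t for _, _, t in triples})
    tid = {t: i for i, t in enumerate(tags)}
    by_user, by_res = defaultdict(list), defaultdict(list)
    for u, r, t in triples:
        by_user[u].append(tid[t])           # a user's "document" is the tags they applied
        by_res[r].append(tid[t])            # a resource's "document" is the tags it received
    # Stage 1: separate user-based and resource-based topic models.
    _, nkw_u = lda_gibbs(list(by_user.values()), k_user, len(tags), seed=seed)
    _, nkw_r = lda_gibbs(list(by_res.values()), k_res, len(tags), seed=seed + 1)
    pu, pr = topic_given_word(nkw_u), topic_given_word(nkw_r)
    # Stage 2 (assumed encoding): each tag becomes a pseudo-document whose "words"
    # are first-stage topic ids, repeated in proportion to the tag's probability.
    pseudo_docs = []
    for i in range(len(tags)):
        doc = []
        for f, p in enumerate(pu[i] + pr[i]):
            doc.extend([f] * round(scale * p))
        pseudo_docs.append(doc)
    ndk, _ = lda_gibbs(pseudo_docs, n_clusters, k_user + k_res, seed=seed + 2)
    # Assign each tag to the mixed topic on which it has the highest count.
    return {t: max(range(n_clusters), key=lambda k: ndk[tid[t]][k]) for t in tags}

# Toy annotations, invented for this example only.
triples = [
    ("u1", "r1", "python"), ("u1", "r1", "code"), ("u1", "r2", "programming"),
    ("u2", "r2", "python"), ("u2", "r2", "code"),
    ("u3", "r3", "recipe"), ("u3", "r3", "cooking"), ("u3", "r4", "food"),
    ("u4", "r4", "recipe"), ("u4", "r4", "cooking"),
]
clusters = cluster_tags(triples)
print(clusters)
```

Because the second stage works on vectors of length `k_user + k_res` rather than on the full user or resource vocabulary, it operates in a much lower-dimensional and denser space, which is one way to read the abstract's claim about mitigating high dimensionality and sparsity. Cluster labels depend on the random seed, and real data would need far more iterations and tuned topic counts.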