
The Topic Recognition Method of Socialized Tags Based on Improved LDA
Abstract: Resources in social tagging data are typically treated as independent and identically distributed (i.i.d.), while the tags attached to each resource carry special semantic content about that resource. To address the i.i.d. assumption and the related feature word sampling problem, this paper proposes Joint Feature Word Weighting-LDA, a method that identifies topics jointly from resource content and tags. First, a latent relationship between comments and their corresponding tags is established based on information entropy similarity. A random walk over this relationship yields weight coefficients for each group of resources and each group of tags, which removes the i.i.d. treatment of resources. These coefficients are then applied to the feature words of each resource, giving weight values for both resource feature words and tag feature words. The Joint Feature Word Weighting-LDA model is built on these weights, and the latent topics of socially tagged resources are learned iteratively. Experiments show that the proposed model identifies topics better than the other topic models compared.
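The weighting steps described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the information entropy similarity formula, so the similarity matrix here is made up; the walk is a standard damped (PageRank-style) random walk, which may differ from the one used in the paper; and the names `random_walk_weights`, `weight_feature_words`, and `damping` are hypothetical. The final LDA training step is omitted.

```python
def random_walk_weights(similarity, damping=0.85, tol=1e-10, max_iter=1000):
    """Stationary weights of a damped random walk over a similarity graph.

    Assumption: `similarity` is a square list of lists of non-negative
    pairwise similarities between resources (or between tags).
    """
    n = len(similarity)
    # Row-normalise similarities into transition probabilities.
    transition = []
    for row in similarity:
        total = sum(row)
        if total > 0:
            transition.append([value / total for value in row])
        else:
            # A node with no similar neighbours jumps uniformly.
            transition.append([1.0 / n] * n)

    weights = [1.0 / n] * n
    for _ in range(max_iter):
        updated = [
            (1.0 - damping) / n
            + damping * sum(weights[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(updated, weights))
        weights = updated
        if change < tol:
            break
    return weights


def weight_feature_words(word_counts, resource_weight):
    """Scale one resource's feature-word counts by that resource's weight."""
    return {word: count * resource_weight for word, count in word_counts.items()}


# Hypothetical similarity matrix for three resources (symmetric, zero diagonal).
similarity = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.3],
    [0.1, 0.3, 0.0],
]
weights = random_walk_weights(similarity)

# Hypothetical feature-word counts for the second resource.
weighted = weight_feature_words({"topic": 3, "model": 1}, weights[1])
```

In this toy input the second resource is the most strongly connected, so it receives the largest weight, and its feature words would count for more when the weighted counts are passed to a topic model.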
Authors: TAI Yue (邰悦); GE Bin (葛斌); LI Huizong (李慧宗) (School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China; School of Computer Science and Technology, Nanyang Normal University, Nanyang, Henan 473061, China)
Source: Journal of Anhui University of Science and Technology: Natural Science (《安徽理工大学学报(自然科学版)》), CAS, 2021, No. 5, pp. 55-63 (9 pages)
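Funding: National Natural Science Foundation of China (51874003, 61703005); Youth Fund for Humanities and Social Sciences Research of the Ministry of Education (13YJCZH077); Natural Science Foundation of Anhui Province (1808085MG221).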
Keywords: socialized tag; information entropy similarity; independent and identical distribution; weighting method; Latent Dirichlet Allocation (LDA)
