A Belief Propagation Algorithm for the LDA Model Based on Dynamic Priors

BELIEF PROPAGATION OF LATENT DIRICHLET ALLOCATION BASED ON DYNAMIC PRIORI PARAMETERS
Abstract Variational Bayes, Gibbs sampling, and belief propagation are the three main approximate inference algorithms for the latent Dirichlet allocation (LDA) model, and belief propagation clearly outperforms the other two in both efficiency and accuracy. To obtain a highly interpretable latent semantic space, this paper proposes a belief propagation algorithm that dynamically adjusts the prior parameters during iteration, learning them automatically by a fixed-point iteration method with an added Gamma prior. It also explores how symmetric and asymmetric priors affect the model's generalisation ability and text classification accuracy. Experimental results show that the proposed dynamic asymmetric prior algorithm improves the model's generalisation ability and raises text classification accuracy.
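The abstract does not give the update formula itself. As a rough illustration of a fixed-point iteration with a Gamma prior, the sketch below uses the commonly cited Minka-style update for the Dirichlet document-topic prior in its MAP form with a Gamma(shape, rate) hyperprior. It is an assumption that the paper's update has this form. The names `digamma` and `update_alpha`, the default `shape` and `rate` values, and the small count matrix are all hypothetical and chosen only for illustration.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    # Shift x upward until the asymptotic series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # ln(x) - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 * inv - series


def update_alpha(counts, alpha, shape=1.001, rate=1.0, iters=50, floor=1e-10):
    """Fixed-point MAP update of an asymmetric Dirichlet prior alpha.

    counts[d][k] is the (possibly fractional) number of tokens in document d
    assigned to topic k. shape and rate are hypothetical Gamma hyperprior
    values, not taken from the paper. Documents must be non-empty.
    """
    alpha = list(alpha)
    doc_totals = [sum(row) for row in counts]
    for _ in range(iters):
        alpha_sum = sum(alpha)
        # Denominator is shared by all topics.
        denom = sum(digamma(n + alpha_sum) - digamma(alpha_sum)
                    for n in doc_totals)
        new_alpha = []
        for k, a_k in enumerate(alpha):
            num = sum(digamma(row[k] + a_k) - digamma(a_k) for row in counts)
            value = (shape - 1.0 + a_k * num) / (rate + denom)
            # Keep alpha strictly positive so digamma stays defined.
            new_alpha.append(max(floor, value))
        alpha = new_alpha
    return alpha


if __name__ == "__main__":
    # Toy counts: topic 0 dominates every document.
    toy_counts = [[9, 1], [8, 2], [10, 0], [7, 3]]
    print(update_alpha(toy_counts, [0.5, 0.5]))
```

In a dynamic-prior scheme of the kind the abstract describes, a step like `update_alpha` would presumably be interleaved with the inference sweeps, so the prior is re-estimated from the current topic counts. A symmetric prior corresponds to tying all entries of `alpha` to a single value.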
Source Computer Applications and Software (《计算机应用与软件》), CSCD, 2015, No. 8, pp. 220-223, 275 (5 pages)
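Funding National Natural Science Foundation of China (61003154, 61373092, 61033013, 61272449, 61202029); Natural Science Research Project of Jiangsu Higher Education Institutions (11KJB520018); Major Project of the Jiangsu Provincial Department of Education (12KJA520004); Soochow University Innovation Team Project (SDT2012B02); Open Project of a Guangdong Provincial Key Laboratory (SZU-GDPHPCL-2012-09)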
Keywords LDA; belief propagation algorithm; symmetric prior; asymmetric prior