Adaptive multi-document summarization algorithm based on fusion topic model
Abstract Building on the LDA topic model, this paper proposes a multi-document summarization algorithm based on adaptive topic fusion. Because titles are strong indicators of what a summary should contain, separate topic models are built for each document's title and body text, and the two models are then fused. During fusion, the algorithm performs adaptive asymmetric learning based on the information entropy of the two forms of text, and uses the result to weight their topic distributions. The fused model relates title and body information appropriately, which helps improve summary quality. Experimental results show that the proposed algorithm achieves good performance on the DUC2002 standard dataset.
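The entropy-weighted fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, so the rule used here is an assumption made purely for illustration: the distribution with lower entropy (the more focused one) receives the larger weight. The function names `entropy` and `fuse_topics` are hypothetical and do not come from the paper.

```python
import math


def entropy(dist):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def fuse_topics(title_dist, body_dist):
    """Fuse a title and a body topic distribution with entropy-based weights.

    ASSUMPTION (not taken from the paper): each distribution is weighted by
    the other's share of the total entropy, so the more focused (lower
    entropy) distribution contributes more to the fused result.
    """
    if len(title_dist) != len(body_dist):
        raise ValueError("both distributions must cover the same topics")

    h_title = entropy(title_dist)
    h_body = entropy(body_dist)
    total = h_title + h_body

    # If both distributions are deterministic, fall back to equal weights.
    w_title = 0.5 if total == 0 else h_body / total
    w_body = 1.0 - w_title

    # A convex combination of two distributions is again a distribution.
    fused = [w_title * t + w_body * b for t, b in zip(title_dist, body_dist)]
    return fused, w_title, w_body


if __name__ == "__main__":
    title = [0.8, 0.1, 0.1]  # focused title topics (made-up numbers)
    body = [0.4, 0.3, 0.3]   # more diffuse body topics (made-up numbers)
    fused, w_title, w_body = fuse_topics(title, body)
    print("weights:", w_title, w_body)
    print("fused:", fused)
```

In this sketch the weights are asymmetric whenever the two entropies differ, which is one plausible reading of "adaptive asymmetric learning"; the paper's own scheme may differ.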
Source Journal of Central South University (Science and Technology) (《中南大学学报(自然科学版)》), 2013, Issue S2, pp. 205-209 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding National Natural Science Foundation of China (grants 61073133, 61175053, 61272369)
Keywords multi-document summarization; topic model; adaptive learning; information entropy