
Weibo-oriented news summarization based on matrix factorization and submodular maximization (cited by: 5)
Abstract This paper presents a method for Weibo-oriented Chinese news summarization that combines matrix factorization with submodular maximization. The method first uses the orthogonal matrix factorization (OrMF) model to obtain latent semantic vectors for news text, which addresses the information sparsity of short texts and keeps the projection directions approximately orthogonal to reduce redundancy. It then scores a set of news sentences for relevance and diversity; the objective function consists of several monotone submodular functions and one non-submodular function that measures sentence dissimilarity. Finally, a greedy algorithm selects the summary sentences. Experiments on the NLPCC2015 dataset show that the method improves the quality of Weibo-oriented news summaries and that its ROUGE scores exceed those of the baseline systems.
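Source Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2017, No. 10, pp. 2892-2896, 2928 (6 pages)
Funding Major Bidding Program of the National Social Science Foundation of China (11&ZD189); General Program of the National Natural Science Foundation of China (61373108)
Keywords submodularity; orthogonal matrix factorization; news summarization; extractive summarization; Weibo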
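The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the exact objective, so the saturated coverage term, the pairwise dissimilarity term, the weights `alpha` and `lam`, the cost-scaled greedy rule, and the names `objective` and `greedy_summary` are all assumptions made for illustration. The input vectors stand in for the OrMF latent vectors, which are not computed here.

```python
import math


def cosine(u, v):
    """Cosine similarity, clipped at 0 so every coverage term is non-negative."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return max(0.0, dot / (nu * nv)) if nu and nv else 0.0


def objective(selected, sim, alpha=0.5, lam=0.1):
    """Assumed objective: saturated coverage plus weighted pairwise dissimilarity."""
    n = len(sim)
    # Coverage: how well the summary represents each sentence, capped at a
    # fraction alpha of that sentence's total similarity (monotone submodular).
    coverage = sum(
        min(sum(sim[i][j] for j in selected), alpha * sum(sim[i]))
        for i in range(n)
    )
    # Dissimilarity: sum of (1 - similarity) over selected pairs; this term
    # rewards diverse summaries and is not submodular in general.
    chosen = sorted(selected)
    dissim = sum(
        1.0 - sim[a][b] for k, a in enumerate(chosen) for b in chosen[k + 1:]
    )
    return coverage + lam * dissim


def greedy_summary(vectors, lengths, budget, alpha=0.5, lam=0.1):
    """Greedily add the sentence with the best gain per unit length within budget."""
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    selected, used = [], 0
    remaining = set(range(n))
    while remaining:
        base = objective(selected, sim, alpha, lam)
        best, best_ratio = None, 0.0
        for i in sorted(remaining):
            if used + lengths[i] > budget:
                continue  # sentence does not fit in the remaining budget
            gain = objective(selected + [i], sim, alpha, lam) - base
            ratio = gain / lengths[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break  # nothing fits or nothing improves the objective
        selected.append(best)
        used += lengths[best]
        remaining.discard(best)
    return selected


# Hypothetical toy input: sentences 0 and 1 are near-duplicates, sentence 2 differs.
vectors = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
summary = greedy_summary(vectors, lengths=[10, 10, 10], budget=20)
print(summary)
```

On this toy input the cap in the coverage term and the dissimilarity reward both discourage picking the two near-duplicate sentences together, which is the redundancy-reduction behaviour the abstract describes.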