Single document automatic summarization algorithm based on word-sentence co-ranking

Cited by: 8
Abstract Extractive summarization must select a limited number of important sentences from a document to form a short text that covers its main ideas. To address this problem, a single-document automatic summarization algorithm based on word-sentence co-ranking, WSRank for short, is proposed; it integrates the word-sentence relationship into the graph-based computation of sentence weights. The co-ranking framework of WSRank is given first, then converted into a concise matrix form, and its convergence is proved theoretically. A redundancy elimination technique is then added to further improve summary quality. Experiments on real datasets show that WSRank improves on the classic TextRank algorithm by 13% to 30% on Rouge metrics, demonstrating the effectiveness of the proposed method.
Source Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2017, No. 7, pp. 2100-2105 (6 pages)
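Funding National Natural Science Foundation of China (71571093, 71372188); National Center for International Joint Research on E-Business Information Processing (2013B01035); Natural Science Foundation of the Jiangsu Higher Education Institutions (15KJB520012); Nanjing University of Finance and Economics pre-research project (YYJ201415)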
Keywords automatic summarization; extractive summary; single document; graph-based ranking; word-sentence collaboration
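
The abstract names the ingredients of WSRank (graph-based sentence ranking, a word-sentence relationship, iteration to convergence, redundancy elimination) but does not give its update equations. The following is therefore only a minimal sketch of one plausible co-ranking scheme, not the authors' formulation. The function names `co_rank` and `summarize`, the mixing weight `alpha`, the cosine similarity measure, and the redundancy `threshold` are all assumptions made for illustration.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two token lists treated as bags of words."""
    dot = sum(a.count(t) * b.count(t) for t in set(a) & set(b))
    norm_a = math.sqrt(sum(a.count(t) ** 2 for t in set(a)))
    norm_b = math.sqrt(sum(b.count(t) ** 2 for t in set(b)))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def normalize(scores):
    """Scale non-negative scores to sum to 1 (uniform if they are all zero)."""
    if not scores:
        return []
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [x / total for x in scores]


def co_rank(sentences, alpha=0.5, iterations=50):
    """Assumed co-ranking: words and sentences reinforce each other's scores."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    vocab = sorted({t for toks in tokens for t in toks})
    index = {t: k for k, t in enumerate(vocab)}
    # Sentence-sentence similarity graph without self-loops.
    sim = [[cosine(tokens[i], tokens[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    sent = [1.0 / n] * n
    for _ in range(iterations):
        # A word is important if it occurs in important sentences.
        word = [0.0] * len(vocab)
        for i, toks in enumerate(tokens):
            for t in set(toks):
                word[index[t]] += sent[i]
        word = normalize(word)
        # A sentence is important if similar sentences are important
        # (graph part) and if it contains important words (word part).
        graph_part = normalize(
            [sum(sim[j][i] * sent[j] for j in range(n)) for i in range(n)])
        word_part = normalize(
            [sum(word[index[t]] for t in set(toks)) for toks in tokens])
        sent = normalize([alpha * g + (1 - alpha) * w
                          for g, w in zip(graph_part, word_part)])
    return sent


def summarize(sentences, k, threshold=0.7):
    """Greedily keep the top-scoring sentences, skipping near-duplicates."""
    scores = co_rank(sentences)
    tokens = [tokenize(s) for s in sentences]
    chosen = []
    for i in sorted(range(len(sentences)), key=lambda i: -scores[i]):
        if all(cosine(tokens[i], tokens[j]) <= threshold for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences, k)` on a list of sentence strings returns at most `k` of them in document order. The fixed iteration count stands in for a convergence test, and the greedy threshold filter is only one simple way to realize redundancy elimination.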