
Graph model summary extraction algorithm based on BERT bidirectional pretraining (基于BERT双向预训练的图模型摘要抽取算法)
Cited by: 4
Abstract: Most recent automatic summarization algorithms rely on supervised learning and do not account for the cost of manually labeling corpora. In addition, most summarization models cannot draw on context when embedding sentences, so they express semantic information incompletely and ignore the overall information of the text. To address these problems, this paper proposes an extractive summarization model that combines an improved BERT bidirectional pre-trained language model with a graph sorting algorithm. The model maps each sentence to a structured sentence vector according to its position and context, then uses the graph sorting algorithm to select the highest-impact sentences as a temporary summary. To avoid a highly redundant summary, it then eliminates redundancy from the temporary summary. Experiments on the public CNN/Daily Mail dataset show that the proposed model improves summary scores and outperforms other improved graph-sorting-based summary extraction algorithms.
Authors: Fang Ping (方萍); Xu Ning (徐宁). School of Computer Science & Technology, Wuhan University of Technology, Wuhan 430070, China; School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2021, No. 9, pp. 2657-2661 (5 pages)
Keywords: extractive summary; BERT; graph sorting algorithm; redundancy elimination
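
The abstract describes three steps: embed sentences, rank them on a graph, then remove redundant picks. The sketch below illustrates that general shape only. It is not the authors' method: the abstract does not give the improved BERT encoder, the graph weighting, or the redundancy rule, so this example assumes sentence vectors are already computed (toy vectors stand in for BERT output), uses a TextRank-style PageRank iteration as the ranking step, and uses a greedy cosine-similarity threshold as the redundancy filter. All function names and parameters (`rank_sentences`, `summarize`, `redundancy_threshold`) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_sentences(vectors, damping=0.85, iterations=50):
    """TextRank-style scores on a similarity graph (an assumed stand-in
    for the paper's graph sorting step)."""
    n = len(vectors)
    # Edge weights: non-negative cosine similarity, no self-loops.
    weights = [
        [max(cosine(vectors[i], vectors[j]), 0.0) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def summarize(vectors, k=2, redundancy_threshold=0.9):
    """Pick up to k top-ranked sentence indices, skipping any candidate that is
    too similar to one already chosen (an assumed greedy redundancy filter)."""
    scores = rank_sentences(vectors)
    by_score = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in by_score:
        if all(cosine(vectors[i], vectors[j]) < redundancy_threshold for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return sorted(chosen)  # restore document order for readability


# Toy vectors standing in for sentence embeddings; sentences 0 and 1 are near-duplicates.
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.98, 0.2, 0.0],
    [0.0, 1.0, 0.2],
    [0.1, 0.1, 1.0],
]
print(summarize(toy_vectors, k=2))
```

In this sketch the redundancy filter runs during selection rather than as a separate pass over a finished temporary summary; the abstract suggests the paper does the latter, so treat the ordering here as a simplification.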