
Chinese news text abstractive summarization with keywords fusion (融合关键词的中文新闻文本摘要生成)  Cited by: 4
Abstract: Existing seq2seq models often generate summary words that are semantically irrelevant to the source text, and they do not consider the role of keywords in summary generation. To address these problems, this paper proposes a Chinese news text abstractive summarization method with keyword fusion. First, the source text words are fed in order into a Bi-LSTM model. The hidden state at each time step is then fed into a sliding convolutional neural network, which extracts local features between each word and its adjacent words. Next, keyword information and a gating unit are used to filter the news text information and remove redundant information. A self-attention mechanism then captures the global feature information of each word, so that the encoder yields a hierarchical word feature representation combining local and global features. Finally, the encoded word feature representation is fed into an LSTM model with an attention mechanism, which decodes the summary. The method models the n-gram features of news words with the sliding convolutional network and, on that basis, uses self-attention to obtain the hierarchical local-plus-global word representation. It also takes into account the important role of keywords in news summary generation, using the gating unit to remove redundant information and obtain more accurate news text information. Experiments on the Sogou full-web news corpus show that the method effectively improves the quality of generated summaries, with higher ROUGE-1, ROUGE-2, and ROUGE-L scores.
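The three encoder stages between the Bi-LSTM and the decoder can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration rather than the paper's formulation: the function names (`sliding_conv`, `keyword_gate`, `self_attention`, `encode`), the fixed kernel shared across channels, the gate form sigmoid(word · keyword), and the identity projections in the attention step are all hypothetical, and the Bi-LSTM hidden states are taken as given toy vectors.

```python
import math


def sliding_conv(states, kernel):
    """1-D convolution over time with zero padding. Each output vector mixes a
    word with its neighbours, i.e. an n-gram window of len(kernel) words.
    Assumption: one scalar kernel shared by all channels."""
    half = len(kernel) // 2
    dim = len(states[0])
    out = []
    for t in range(len(states)):
        vec = [0.0] * dim
        for offset, weight in enumerate(kernel):
            j = t + offset - half
            if 0 <= j < len(states):  # positions outside the text count as zeros
                for d in range(dim):
                    vec[d] += weight * states[j][d]
        out.append(vec)
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def keyword_gate(features, keyword_vec):
    """Scale each word vector by sigmoid(word . keyword), so words that score
    low against the keyword vector are damped. Assumed gate form."""
    gated = []
    for vec in features:
        score = sum(a * b for a, b in zip(vec, keyword_vec))
        gate = sigmoid(score)
        gated.append([gate * a for a in vec])
    return gated


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity query/key/value
    projections (an assumption), giving every word a view of the whole text."""
    scale = math.sqrt(len(features[0]))
    out = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in features]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, features))
                    for d in range(len(q))])
    return out


def encode(hidden_states, keyword_vec, kernel=(0.25, 0.5, 0.25)):
    """Local n-gram features -> keyword-gated filtering -> global features."""
    local = sliding_conv(hidden_states, kernel)
    filtered = keyword_gate(local, keyword_vec)
    return self_attention(filtered)


# Toy stand-ins for Bi-LSTM hidden states (4 words, 2 dimensions each).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
keyword_vec = [2.0, -2.0]
encoded = encode(hidden_states, keyword_vec)
```

Here `encoded` holds one vector per source word. In the described method these vectors would be passed to an attention-based LSTM decoder, which this sketch omits.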
Authors: NING Shan (宁珊); YAN Xin (严馨); XU Guang-yi (徐广义); ZHOU Feng (周枫); ZHANG Lei (张磊) (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650504; Yunnan Nantian Electronic Information Industry Co., Ltd., Kunming 650040, China)
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core Journal (北大核心), 2020, No. 12, pp. 2265-2272 (8 pages)
Funding: National Natural Science Foundation of China (61562049, 61462055).
Keywords: text abstractive summarization; sliding convolutional network; keyword information fusion; gating unit; global coding
