Abstract
To address the problem that the encoder in the sequence-to-sequence (seq2seq) model cannot fully encode the source text, this paper constructs a CGAtten-GRU model based on a dual-encoder network structure. The two encoders use a Convolutional Neural Network (CNN) and a Bidirectional Gated Recurrent Unit (BiGRU) respectively, and the source text enters both encoders in parallel. An attention mechanism is built on the combined outputs of the two encoding networks. The decoder uses a GRU network that integrates the Copy mechanism and beam search to improve decoding accuracy. Experimental results on the large-scale Chinese short text summarization dataset LCSTS show that, compared with the RNN context model, the proposed model improves Rouge-1 by 0.1, Rouge-2 by 0.059, and Rouge-L by 0.046.
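Below is a minimal PyTorch sketch of the dual-encoder idea described in the abstract. It is not the paper's implementation: the element-wise sum used to fuse the CNN and BiGRU outputs, the Luong-style general attention, and all layer sizes and class names are illustrative assumptions, and the Copy mechanism and beam search are only indicated in comments.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoder(nn.Module):
    """CNN branch and BiGRU branch encode the source in parallel;
    their outputs are fused into one memory for attention. The
    element-wise sum used for fusion is an assumption: the abstract
    only states that the two outputs are combined."""

    def __init__(self, vocab_size, emb_dim=256, hid_dim=256, kernel_size=3):
        super().__init__()
        self.hid_dim = hid_dim
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # CNN encoder: 1-D convolution over the token axis, padded so
        # the output length matches the source length.
        self.conv = nn.Conv1d(emb_dim, hid_dim, kernel_size,
                              padding=kernel_size // 2)
        # BiGRU encoder; forward and backward states are summed so the
        # two branches share the same width (another assumption).
        self.bigru = nn.GRU(emb_dim, hid_dim, batch_first=True,
                            bidirectional=True)

    def forward(self, src):                          # src: (batch, src_len)
        emb = self.embed(src)                        # (batch, src_len, emb)
        conv_out = torch.relu(
            self.conv(emb.transpose(1, 2))).transpose(1, 2)
        gru_out, _ = self.bigru(emb)                 # (batch, src_len, 2*hid)
        gru_out = gru_out[..., :self.hid_dim] + gru_out[..., self.hid_dim:]
        return conv_out + gru_out                    # fused memory

class AttnGRUDecoder(nn.Module):
    """GRU decoder attending over the fused memory. The paper's Copy
    mechanism and beam search are noted in comments only; this sketch
    computes a single decoding step."""

    def __init__(self, vocab_size, emb_dim=256, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.attn = nn.Linear(hid_dim, hid_dim, bias=False)
        self.gru = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim * 2, vocab_size)

    def forward(self, prev_tok, hidden, memory):
        # prev_tok: (batch, 1), hidden: (1, batch, hid),
        # memory: (batch, src_len, hid)
        emb = self.embed(prev_tok)                   # (batch, 1, emb)
        query = hidden.transpose(0, 1)               # (batch, 1, hid)
        scores = torch.bmm(self.attn(memory),        # (batch, src_len, 1)
                           query.transpose(1, 2))
        weights = F.softmax(scores, dim=1)
        context = torch.bmm(weights.transpose(1, 2), memory)  # (batch, 1, hid)
        out, hidden = self.gru(torch.cat([emb, context], dim=-1), hidden)
        logits = self.out(torch.cat([out, context], dim=-1))
        # A full CGAtten-GRU would mix these logits with the attention
        # weights over source tokens (Copy mechanism) and keep the top-k
        # partial hypotheses at each step (beam search).
        return logits, hidden

# Toy usage with a made-up vocabulary size and batch.
enc, dec = DualEncoder(5000), AttnGRUDecoder(5000)
src = torch.randint(0, 5000, (2, 20))
memory = enc(src)
hidden = torch.zeros(1, 2, 256)
logits, hidden = dec(torch.zeros(2, 1, dtype=torch.long), hidden, memory)
```

Feeding the same source through both branches lets the attention draw on the CNN's local n-gram features and the BiGRU's long-range context at once, which is the stated motivation for the dual-encoder design.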
Authors
FENG Dujuan, YANG Lu, YAN Jianfeng (School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China)
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2020, No. 6, pp. 60-64 (5 pages)
Funding
National Natural Science Foundation of China (61572339, 61272449)
Key Project of the Jiangsu Province Science and Technology Support Program (BE2014005).
Keywords
Natural Language Processing (NLP)
abstractive summarization
Convolutional Neural Network (CNN)
Gated Recurrent Unit (GRU)
attention mechanism
sequence-to-sequence (seq2seq) model
Copy mechanism