Abstract
To address the low accuracy of Chinese automatic summarization, this paper integrates a copy mechanism and the input-feeding method into the decoder of an attention-based sequence-to-sequence (seq2seq) baseline model, yielding a more accurate Chinese automatic summarization model. First, the model uses a pointer network to extend the fixed vocabulary with the out-of-vocabulary (OOV) words that appear in the source sequence, so that OOV words can be copied from the source sequence into the generated sequence. Second, the input-feeding method tracks the attention decisions made for the already generated sequence in order to improve output accuracy. Experimental results on the NLPCC2018 dataset show that the proposed model achieves higher ROUGE scores than the baseline model, which verifies its feasibility.
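The two decoder additions named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give their equations, so the code assumes the common pointer-generator formulation (a gate `p_gen` mixing the vocabulary distribution with the attention weights over source positions) and the usual form of input-feeding (concatenating the previous attentional vector with the current input embedding). All function and variable names are hypothetical.

```python
def extend_vocab(source_tokens, vocab):
    """Map source tokens to ids; each distinct OOV token gets a
    temporary id placed after the fixed vocabulary."""
    oov_ids = {}
    source_ids = []
    for tok in source_tokens:
        if tok in vocab:
            source_ids.append(vocab[tok])
        else:
            if tok not in oov_ids:
                oov_ids[tok] = len(vocab) + len(oov_ids)
            source_ids.append(oov_ids[tok])
    return source_ids, oov_ids


def final_distribution(p_vocab, attention, source_ids, p_gen, n_oov):
    """Mix generating and copying over the extended vocabulary.

    Assumed formulation: p_gen * P_vocab(w) plus (1 - p_gen) times the
    attention mass on every source position that holds w.
    """
    dist = [p_gen * p for p in p_vocab] + [0.0] * n_oov
    for weight, idx in zip(attention, source_ids):
        dist[idx] += (1.0 - p_gen) * weight
    return dist


def input_feed(embedding, prev_attentional_vector):
    """Assumed input-feeding step: the decoder input is the current
    embedding concatenated with the previous attentional vector."""
    return list(embedding) + list(prev_attentional_vector)


# Toy example: "衡阳" is not in the fixed vocabulary.
vocab = {"<unk>": 0, "今天": 1, "天气": 2}
source_ids, oov_ids = extend_vocab(["今天", "衡阳", "天气"], vocab)
dist = final_distribution(
    p_vocab=[0.1, 0.6, 0.3],
    attention=[0.2, 0.7, 0.1],
    source_ids=source_ids,
    p_gen=0.5,
    n_oov=len(oov_ids),
)
decoder_input = input_feed([0.1, 0.2], [0.3, 0.4, 0.5])
```

In this toy case the OOV word receives probability only through the copy term, which is why it can appear in the output even though the fixed vocabulary has no entry for it.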
Authors
Nong Ding'an (农丁安)
Ouyang Chunping (欧阳纯萍)
Yang Xiaohua (阳小华)
School of Computer, University of South China, Hengyang, Hunan 421001, China
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core Journal (北大核心)
2020, No. 8, pp. 2395-2399 (5 pages)
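Funding
National Natural Science Foundation of China (61402220, 61502221)
Hunan Provincial Philosophy and Social Science Foundation (16YBA323)
Hunan Provincial Natural Science Foundation (2015JJ3015)
Youth Project of the Hunan Provincial Department of Education (15B207)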