Abstract
In automatic headline generation, a BiLSTM encodes text recurrently over time, reading the word sequence one word at a time, so semantic information weakens as the state is passed along. To address this, the paper builds a sentence-level LSTM encoder that encodes every word in the text in parallel. Each recurrent step exchanges information between the local states of the individual words and a global state for the whole text. Once the semantic representation has been encoded, a decoder based on a hybrid pointer network generates the headline. Experiments on relevant datasets verify the model's effectiveness on the headline generation task.
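The state exchange described in the abstract can be illustrated with a small sketch. This is not the authors' model: the function names `exchange_step` and `encode` are hypothetical, the update uses an unweighted average passed through tanh in place of learned LSTM gates, and the neighbour window of one word on each side is an assumption. It only shows the shape of the idea, in which all word states are updated together at each step while also reading from and feeding a global state.

```python
import math


def exchange_step(embeddings, word_states, global_state):
    """One simultaneous update of all word states and the global state.

    Simplification (assumption): tanh of an unweighted average stands in
    for a gated LSTM-style update; there are no learned parameters.
    """
    n = len(word_states)
    dim = len(global_state)
    zero = [0.0] * dim
    new_words = []
    for i in range(n):
        left = word_states[i - 1] if i > 0 else zero
        right = word_states[i + 1] if i < n - 1 else zero
        # Each word mixes its embedding, its own state, both neighbours,
        # and the global state. All reads use the previous step's values,
        # so the positions could be computed in parallel.
        sources = [embeddings[i], word_states[i], left, right, global_state]
        new_words.append(
            [math.tanh(sum(s[d] for s in sources) / len(sources)) for d in range(dim)]
        )
    # The global state is refreshed from its previous value and the mean
    # of the previous word states.
    new_global = [
        math.tanh((global_state[d] + sum(w[d] for w in word_states) / n) / 2.0)
        for d in range(dim)
    ]
    return new_words, new_global


def encode(embeddings, steps=3):
    """Run a fixed number of exchange steps from all-zero initial states."""
    dim = len(embeddings[0])
    word_states = [[0.0] * dim for _ in embeddings]
    global_state = [0.0] * dim
    for _ in range(steps):
        word_states, global_state = exchange_step(embeddings, word_states, global_state)
    return word_states, global_state


if __name__ == "__main__":
    toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # made-up 2-d vectors
    words, sentence = encode(toy_embeddings, steps=3)
    print(words)
    print(sentence)
```

In this sketch the number of steps is independent of the sentence length, which is the contrast with a BiLSTM that needs one sequential step per word.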
Authors
Qian Yili (钱揖丽)
Ma Xuewen (马雪雯)
School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China; Key Laboratory of Ministry of Education for Computational Intelligence and Chinese Information Processing, Shanxi University, Taiyuan 030006, Shanxi, China
Source
Computer Applications and Software (《计算机应用与软件》)
Peking University Core Journal (北大核心)
2021, Issue 5, pp. 190-195 (6 pages)
Funding
National Key R&D Program of China, Key Special Project (2018YFB1005103)
National Natural Science Foundation of China (61573231, 61673248)
Keywords
Headline generation
Sentence-level
LSTM
Sequence-to-sequence model