
A Sentence Fusion Method Based on Transformer and Keyword Recognition
Abstract: Sentence fusion generates a single concise, grammatical sentence from multiple sentences and can be applied to natural language processing tasks such as automatic summarization and paraphrase generation. Existing sentence fusion methods have achieved some success, but the generated sentences still tend to omit important information or drift from the meaning of the original sentences. To alleviate these problems, this paper proposes a sentence fusion method based on Transformer and keyword recognition. The method consists of two modules: (1) a keyword recognition module, which uses a sequence labeling model to identify the important words in the original sentences; and (2) a sentence fusion module, which feeds the keywords together with the original sentences into a Transformer framework, uses BERT for semantic representation, and then introduces at the fully connected layer a vector derived from the original sentences and the vocabulary as prior knowledge for generating the fused sentence. A sentence fusion dataset was constructed from the NLPCC2017 summarization task data, and experiments on it show that the proposed method performs significantly better than the baseline system.
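The abstract does not specify how the prior-knowledge vector is built or how it enters the fully connected layer. One plausible reading is a 0/1 indicator over the vocabulary that marks tokens occurring in the original sentences, which is then used to raise the output scores of those tokens. The minimal sketch below illustrates that reading only; the function names, the toy vocabulary, the additive combination, and the `weight` value are all assumptions made for illustration and are not taken from the paper.

```python
import math


def build_prior_vector(source_tokens, vocab):
    """Return a 0/1 vector over the vocabulary.

    An entry is 1.0 when the vocabulary token occurs in the source
    sentences (assumed form of the prior, not the paper's definition).
    """
    present = set(source_tokens)
    return [1.0 if tok in present else 0.0 for tok in vocab]


def apply_prior(logits, prior, weight=2.0):
    """Add a fixed bonus to the logits of tokens seen in the source.

    The additive form and the weight are hypothetical choices.
    """
    return [z + weight * p for z, p in zip(logits, prior)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary and source tokens, for illustration only.
vocab = ["[PAD]", "cat", "dog", "sat", "ran", "mat"]
source_tokens = ["cat", "sat", "mat", "cat"]

prior = build_prior_vector(source_tokens, vocab)

# Made-up decoder scores for one generation step.
logits = [0.0, 0.5, 0.6, 0.1, 0.2, 0.0]
probs_plain = softmax(logits)
probs_prior = softmax(apply_prior(logits, prior))

best_plain = vocab[probs_plain.index(max(probs_plain))]
best_prior = vocab[probs_prior.index(max(probs_prior))]
```

In this toy step the unbiased scores favor "dog", which does not occur in the source, while the biased scores favor "cat", which does. That shift toward source vocabulary is the kind of effect such a prior would be expected to have on semantic drift.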
Authors: Tan Hongye (谭红叶); Li Feiyan (李飞艳) (School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, Shanxi, China)
Source: Computer Applications and Software (《计算机应用与软件》), Peking University Core Journal, 2023, Issue 7, pp. 145-150 (6 pages)
Funding: National Key R&D Program of China, key special project (2018YFB1005103); National Natural Science Foundation of China (61673248); Shanxi Province Graduate Joint Training Base Talent Training Project (2018JD02).
Keywords: sentence fusion; keywords (important words); Transformer; text generation