Abstract
[Objective] This paper generates sentences that express a contrast relationship between two pieces of text, providing a base model for automatically generating contrastive paragraphs. [Methods] We treat contrastive sentence generation as mapping a text sequence composed of two pieces of text to a text sequence that expresses the contrast between them. We design a deep learning model based on Seq2Seq that represents the input text by incorporating contrast features on top of character vectors. Both the Encoder and Decoder layers use a BiLSTM structure, and an Attention mechanism is introduced into the model. [Results] We ran experiments on manually annotated datasets of novelty search reports and scientific papers, using BLEU as the evaluation metric. The final score was 12.1, which is 6.5 higher than that of the baseline model that uses BiLSTM+Attention directly. [Limitations] Because manually annotating contrastive sentences is complex, the amount of data used in the experiments is limited. [Conclusions] The model can generate sentences that are readable to some extent and express a contrast relationship, and it can serve as a base model for generating contrastive paragraphs.
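The abstract reports BLEU scores but does not say how they were computed. As a rough illustration only, the sketch below computes a single-reference, unsmoothed BLEU over character tokens on a 0-100 scale. The character-level tokenization, the 4-gram limit, and the names `ngram_counts` and `sentence_bleu` are assumptions made here for illustration, not details taken from the paper, whose reported score is more likely a corpus-level figure.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Single-reference BLEU on a 0-100 scale, with no smoothing (hypothetical helper)."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            # Without smoothing, any empty n-gram order gives a score of zero.
            return 0.0
        log_precisions.append(math.log(matched / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return 100 * brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Character-level usage: split each sentence into a list of characters.
score = sentence_bleu(list("abcd"), list("abcde"))
```

In this usage example every n-gram of the candidate appears in the reference, so the score is determined entirely by the brevity penalty for the missing final character.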
Authors
焦启航
乐小虬
Jiao Qihang; Le Xiaoqiu (National Science Library, Chinese Academy of Sciences, Beijing 100190, China; Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core Journals
2020, Issue 6, pp. 43-50 (8 pages)
Data Analysis and Knowledge Discovery
Keywords
Contrast Relationship
Text Generation
Text Representation
Deep Learning