Abstract
Recognizing textual entailment aims to identify the logical relationship between two given sentences. In this paper, we integrate the deep semantic information of sentences into the encoder of the Transformer by constructing a fusion module of semantic role labeling and self-attention (SRL-Attention), which strengthens the ability of the self-attention mechanism to capture sentence semantics. Furthermore, since existing Chinese textual entailment datasets are small and noisy, we use a large-scale pre-trained language model to improve recognition performance on small datasets. Experimental results show that the proposed method reaches an accuracy of 80.28% on CNLI, the Chinese textual entailment recognition evaluation dataset released at the 17th China National Conference on Computational Linguistics.
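To make the fused architecture concrete, below is a minimal PyTorch sketch of one plausible reading of the SRL-Attention fusion: semantic-role tag embeddings are added to the Transformer token representations before a standard self-attention layer. The module name SRLAttentionFusion, the role-embedding table, and all dimensions are hypothetical illustrations, not the authors' published implementation.

import torch
import torch.nn as nn

class SRLAttentionFusion(nn.Module):
    # Hypothetical sketch: inject SRL tag embeddings into token
    # representations, then apply multi-head self-attention.
    def __init__(self, d_model=128, n_heads=4, n_roles=10):
        super().__init__()
        self.role_emb = nn.Embedding(n_roles, d_model)  # SRL tags, e.g. ARG0/ARG1/V
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, token_repr, role_ids):
        # token_repr: (batch, seq_len, d_model) Transformer encoder output
        # role_ids:   (batch, seq_len) integer SRL tag ids per token
        x = token_repr + self.role_emb(role_ids)   # fuse semantic-role information
        out, _ = self.attn(x, x, x)                # self-attention over fused input
        return self.norm(token_repr + out)         # residual connection + layer norm

# Toy usage with random inputs
fusion = SRLAttentionFusion()
tokens = torch.randn(2, 6, 128)
roles = torch.randint(0, 10, (2, 6))
print(fusion(tokens, roles).shape)  # torch.Size([2, 6, 128])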
Authors
ZHANG Zhi-chang (张志昌), ZENG Yang-yang (曾扬扬), PANG Ya-li (庞雅丽)
College of Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu 730000, China
Source
《电子学报》(Acta Electronica Sinica), 2020, No. 11, pp. 2162-2169 (8 pages)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
Funding
National Natural Science Foundation of China (No.61762081, No.61662067, No.61662068)
Key Research and Development Program of Gansu Province (No.17YF1GA016)
Keywords
natural language processing
textual entailment
self-attention mechanism
semantic role labeling
pre-trained language model