Abstract
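Because the interaction information between texts is important to text semantic matching models, this paper proposes a self-supervised learning method that incorporates a sequence generation task. The method uses the interaction information of text pairs extracted by a self-supervised model to assist neural-network-based semantic matching models through feature augmentation, yielding a multi-task text matching model. Experiments on nine models show that adding the self-supervised learning module improves every original model to varying degrees, indicating that the proposed method can effectively improve deep text semantic matching models.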
In semantic matching, the interaction information between a pair of texts is critical for predicting the pair's matching score. This paper proposes a multi-task learning framework with self-supervised learning for deep semantic matching. Specifically, a self-supervised model is designed in which the paired sentences regenerate each other using sequence-to-sequence generation. A multi-task learning framework then integrates the representation from the self-supervised generation with that of the deep matching model to predict the similarity score of the texts. Experiments with nine deep matching models show that the proposed framework improves the performance of traditional deep matching models.
Authors
陈源
丘心颖
CHEN Yuan; QIU Xinying (School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006; Guangzhou Key Laboratory of Multilingual Intelligent Processing, Guangdong University of Foreign Studies, Guangzhou 510006)
Source
《北京大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core Journals (北大核心)
2022, No. 1, pp. 83-90 (8 pages)
Acta Scientiarum Naturalium Universitatis Pekinensis
Funding
Supported by the National Social Science Fund of China (17BGL068) and the Natural Science Foundation of Guangdong Province (2018A030313777).
Keywords
self-supervised learning (自监督学习)
semantic matching (文本语义匹配)
multi-task learning (多任务学习)