Abstract
The Sentence-BERT (Sentence Embeddings using Siamese BERT-Networks, SBERT) pre-trained language model has two shortcomings at the representation layer for text matching: (1) after the two queried texts are encoded into vectors by the BERT encoder, similarity is computed directly on those vectors; (2) this computation cannot capture finer-grained representations of the two texts, so the resulting semantics may drift and it is difficult to assess the importance of an individual word in context. Combining interaction-based methods, this paper proposes SBMAA, an improved SBERT model that incorporates a multi-head attention alignment mechanism. The model first obtains the hidden-layer vectors of the two text queries from the pre-trained SBERT; it then computes the similarity matrix between the two texts and uses the attention mechanism to re-encode the tokens of each text against the other, yielding interaction features; finally, the features are pooled and passed through a fully connected layer for prediction. The multi-head attention alignment mechanism refines the interaction-based text matching algorithm and strengthens the association between similar texts, improving matching performance. The method is validated on the ATEC 2018 NLP dataset and the CCKS 2018 WeBank customer question matching dataset. Compared with five popular text similarity matching models (ESIM, ConSERT, BERT-whitening, SimCSE, and the baseline model SBERT), the proposed SBMAA model achieves F1 scores of 84.7% and 90.4% on the two datasets, 18.6% and 8.7% higher than the baseline, respectively; it also performs well in terms of precision and recall, and shows a degree of robustness.
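As a rough illustration only (not the authors' implementation), the interaction step the abstract describes — a similarity matrix between the two token sequences, per-head attention re-encoding of each text against the other, mean pooling, and feature concatenation for the dense prediction layer — can be sketched in plain Python. The toy 4-dimensional token embeddings, the head count, and the `[u, v, |u−v|]` feature layout are all assumptions for the sketch, not values taken from the paper:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(Q, K, V):
    """Scaled dot-product attention: re-encode each query token as a
    similarity-weighted mixture of the value tokens."""
    d = len(Q[0])
    out = []
    for q in Q:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in K])
        out.append([sum(w * v[i] for w, v in zip(weights, V))
                    for i in range(len(V[0]))])
    return out

def multi_head_align(A, B, heads=2):
    """Split the embedding dimension into `heads` slices, align each slice
    of A against the matching slice of B, and concatenate the results."""
    d = len(A[0])
    assert d % heads == 0
    step = d // heads
    aligned = [[] for _ in A]
    for h in range(heads):
        sl = slice(h * step, (h + 1) * step)
        Ah = [tok[sl] for tok in A]
        Bh = [tok[sl] for tok in B]
        for row, new in zip(aligned, attend(Ah, Bh, Bh)):
            row.extend(new)
    return aligned

def mean_pool(tokens):
    """Average token vectors into one sentence vector."""
    n = len(tokens)
    return [sum(tok[i] for tok in tokens) / n for i in range(len(tokens[0]))]

# Hypothetical 4-dimensional token embeddings standing in for the
# SBERT hidden states of two short queries.
A = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.9]]
B = [[1.0, 0.0, 0.4, 0.3], [0.2, 0.8, 0.0, 1.0], [0.5, 0.5, 0.5, 0.5]]

u = mean_pool(multi_head_align(A, B))   # A re-encoded against B, pooled
v = mean_pool(multi_head_align(B, A))   # B re-encoded against A, pooled
# One common feature layout fed to a fully connected classifier.
features = u + v + [abs(x - y) for x, y in zip(u, v)]
print(len(features))  # 12
```

In a real implementation these loops would be batched matrix operations inside the network, and the final `features` vector would feed the fully connected prediction layer described in the abstract.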
Authors
LU Meiqing
SHEN Yanyan
LU Meiqing; SHEN Yanyan (Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen 529020, China; Institute of Advanced Computing and Digital Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China)
Source
《集成技术》
2023, No. 2, pp. 53-63 (11 pages)
Journal of Integration Technology
Funding
National Key Research and Development Program of China (2019YFB1405200)
Guangdong Province 2019 Special Project for Building High-level Universities (5041700175)
Second Batch of New Engineering Research and Practice Projects of the Ministry of Education (E-RGZN20201036)