
An Enhanced Multi-Granularity Feature Fusion Model for Semantic Matching (cited by: 1)
Abstract Semantic matching is an important component of natural language processing and directly constrains the efficiency of tasks such as question answering and information retrieval. Most existing semantic models perform attention interaction using only words as the basic semantic unit. They seldom account for the effect of language granularity on overall modeling, an issue that arises in Chinese from ambiguous word boundaries and insufficient use of character-level information. To address this, an enhanced multi-granularity feature fusion semantic matching model, EMGFM, is proposed. First, the BERT model and word2vec are combined to obtain an enhanced character vector representation. Attention interaction is then carried out at three granularities (characters, words, and sentences), and the interaction results are fused by weighting to highlight the contribution of each kind of interaction information to the overall model. To reduce the information lost during interaction, the interaction information is enhanced by constructing difference features. Finally, the final semantic representation of the text is obtained through max pooling and average pooling and is used to compute the matching degree. The model achieves 87% accuracy on the Chinese dataset of the CCKS question matching competition, higher than several classical semantic matching models, which indicates that the method can effectively improve the accuracy of question semantic matching.
Authors SHANG Fu-hua; JIANG Yi-wen; CAO Mao-jun (School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, China)
Source Computer Technology and Development (《计算机技术与发展》), 2022, No. 7, pp. 28-33 (6 pages)
Funding Natural Science Foundation of Heilongjiang Province (LH2019F004); Youth Science Foundation of Northeast Petroleum University (2018QNL-25); Outstanding Young and Middle-Aged Research and Innovation Team of Northeast Petroleum University (KYCXTD201903).
Keywords semantic matching; language granularity; Siamese network; decomposable attention mechanism; BERT model
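The pipeline summarized in the abstract (attention interaction at several granularities, weighted fusion, difference-based enhancement, then max and average pooling) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The dot-product attention, the softmax-normalized fusion weights, the `[a, b, a - b, a * b]` enhancement layout, and all function names are assumptions made for illustration, since the record does not give the model's equations.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_align(a, b):
    """For each vector in a, return the attention-weighted average of b."""
    dim = len(b[0])
    aligned = []
    for u in a:
        weights = softmax([dot(u, v) for v in b])
        aligned.append([sum(w * v[k] for w, v in zip(weights, b))
                        for k in range(dim)])
    return aligned

def fuse(views, raw_weights):
    """Weighted sum of same-shaped sequences, one per granularity.
    Assumption: the raw weights are normalized with a softmax."""
    norm = softmax(raw_weights)
    length, dim = len(views[0]), len(views[0][0])
    return [[sum(w * view[i][k] for w, view in zip(norm, views))
             for k in range(dim)] for i in range(length)]

def enhance(a, aligned):
    """Assumed enhancement: concatenate each vector with its aligned
    counterpart, their element-wise difference, and their product."""
    return [u + v
            + [x - y for x, y in zip(u, v)]
            + [x * y for x, y in zip(u, v)]
            for u, v in zip(a, aligned)]

def pool(seq):
    """Concatenate max pooling and average pooling over the sequence."""
    dim = len(seq[0])
    max_part = [max(v[k] for v in seq) for k in range(dim)]
    avg_part = [sum(v[k] for v in seq) / len(seq) for k in range(dim)]
    return max_part + avg_part

def represent(a, b_views, raw_weights):
    """Align a against each granularity view of the other sentence,
    fuse the alignments, enhance with differences, then pool."""
    aligned_views = [soft_align(a, b) for b in b_views]
    fused = fuse(aligned_views, raw_weights)
    return pool(enhance(a, fused))

# Toy 2-dimensional vectors standing in for real embeddings.
a = [[1.0, 0.0], [0.0, 1.0]]
b_char = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # character-level view
b_word = [[1.0, 0.0], [0.5, 0.5]]              # word-level view
b_sent = [[0.5, 0.5]]                          # sentence-level view
vec = represent(a, [b_char, b_word, b_sent], [0.0, 0.0, 0.0])
```

Aligning one sentence against each granularity view of the other yields sequences of equal length, which is what makes the weighted sum well defined in this sketch. In a trained model the fusion weights would be learned, and the pooled vectors of both sentences would feed a classifier that produces the matching degree.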
