
Text Matching Model Incorporating Multi-angle Features (cited by: 1)
Abstract Text matching is a core research area in natural language processing. Deep text matching models fall broadly into two types: representation-based and interaction-based. Representation-based models tend to lose semantic focus and struggle to measure the contextual importance of words, while interaction-based models lack global information such as sentence structure and inter-sentence relations. To address these problems, we propose a text matching model that incorporates multi-angle features and uses a Siamese network as its basic architecture. The model generates word vectors with BERT and fuses word-level similarity to strengthen semantic features. It encodes the sentence-structure features of the text with a Bi-LSTM, that is, it incorporates the sentence-structure information carried by the text's part-of-speech sequence. A Transformer encoder then performs multi-level interaction between the sentence-structure features and the text features. Finally, the resulting vectors are concatenated and the similarity between the two texts is inferred from them. Experiments on a subset of the Quora question pairs dataset show that the model outperforms classical deep matching models.
Authors LI Guang; LIU Xin; MA Zhong-Hao; HUANG Hao-Yu; ZHANG Yuan-Ming (School of Computer Science & School of Cyberspace Security, Xiangtan University, Xiangtan 411105, China)
Source Computer Systems & Applications, 2022, No. 7, pp. 158-164 (7 pages)
Funding Hunan Provincial Key R&D Program project on key technologies for intelligent public legal services (2022SK2106).
Keywords text matching; sentence structure; Transformer framework; Siamese neural network; Bi-LSTM; feature fusion; attention mechanism; natural language processing (NLP)
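
The abstract names two steps that are easy to illustrate in isolation: fusing word-level similarity into a sentence representation, and concatenating the two Siamese branch vectors before scoring. The following is a minimal sketch of those two steps only, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the softmax weighting over each word's best cross-sentence match, the `[u, v, |u - v|, u * v]` concatenation, the cosine score standing in for the paper's learned inference layer, and the tiny hand-written vectors standing in for BERT embeddings. The Bi-LSTM part-of-speech encoder and the Transformer interaction layers are omitted.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_weighted_pool(words_a, words_b):
    """Pool words_a into one vector (hypothetical fusion scheme).

    Each word in words_a is weighted by a softmax over its best cosine
    match among the words of words_b, so words with a close counterpart
    in the other sentence contribute more.
    """
    best = [max(cosine(w, v) for v in words_b) for w in words_a]
    exps = [math.exp(s) for s in best]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(words_a[0])
    return [sum(wt * w[i] for wt, w in zip(weights, words_a)) for i in range(dim)]


def match_features(u, v):
    """Concatenate [u, v, |u - v|, u * v] (an assumed, commonly used layout)."""
    diff = [abs(x - y) for x, y in zip(u, v)]
    prod = [x * y for x, y in zip(u, v)]
    return u + v + diff + prod


def siamese_match(words_a, words_b):
    """Run both branches with the same pooling function and score the pair."""
    u = similarity_weighted_pool(words_a, words_b)
    v = similarity_weighted_pool(words_b, words_a)
    # Cosine is a stand-in; a trained model would feed match_features(u, v)
    # to a learned classifier instead.
    return cosine(u, v), match_features(u, v)


if __name__ == "__main__":
    # Toy 2-dimensional "word vectors" (placeholders for BERT outputs).
    question_1 = [[1.0, 0.0], [0.0, 1.0]]
    question_2 = [[0.9, 0.1], [0.1, 0.9]]
    score, features = siamese_match(question_1, question_2)
    print(score, len(features))
```

Both branches share the same pooling function, which is what makes the arrangement Siamese: swapping the two inputs gives the same score.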