Co-driven Recognition of Semantic Entailment Based on Fusion of Transformer and HowNet Sememe Knowledge
Abstract Semantic entailment recognition aims to determine whether two sentences are semantically consistent and whether an entailment relationship holds between them. Existing methods, however, struggle with Chinese synonyms, polysemy, and long texts that are hard to understand. To address these problems, this paper proposes a co-driven Chinese semantic entailment recognition method based on the fusion of Transformer and HowNet sememe knowledge. First, a Transformer encodes the internal structural and semantic information of Chinese sentences at multiple levels (the data-driven component), and the external knowledge base HowNet is introduced to model the sememe-level associations between words (the knowledge-driven component). Next, soft attention computes the interactive attention between the two sentences, and the result is fused with the sememe matrix. Finally, a BiLSTM further encodes the concept-level semantic information of the text and infers semantic consistency and the entailment relationship. The method uses HowNet sememe knowledge to handle polysemy and synonyms, and uses the Transformer to handle long texts. Experiments on financial and multi-semantic paraphrase-pair datasets such as BQ, AFQMC, and PAWSX show that, compared with lightweight models such as DSSM, MwAN, and DRCN and with pre-trained models such as ERNIE, the model improves the accuracy of Chinese semantic entailment recognition (by 2.19% over the DSSM model), keeps the parameter count small (16 M), and adapts to long-text entailment recognition for texts of 50 characters or more.
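The fusion step named in the abstract (soft attention combined with a sememe matrix) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: the toy lexicon `TOY_SEMEMES` stands in for real HowNet lookups, the sememe matrix is a binary "shares at least one sememe" indicator, and the fusion is a simple additive bias weighted by a hypothetical parameter `gamma`. The token encodings are hand-written vectors rather than Transformer outputs.

```python
import math

# Hypothetical toy lexicon (NOT real HowNet entries). In practice each word's
# sememes would be looked up in the HowNet knowledge base.
TOY_SEMEMES = {
    "loan": {"money", "borrow"},
    "borrow": {"borrow"},
    "repay": {"money", "return"},
    "card": {"tool"},
}


def sememe_matrix(words_a, words_b, lexicon):
    """M[i][j] = 1.0 if words_a[i] and words_b[j] share a sememe, else 0.0."""
    return [
        [1.0 if lexicon.get(a, set()) & lexicon.get(b, set()) else 0.0
         for b in words_b]
        for a in words_a
    ]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def fused_soft_attention(enc_a, enc_b, sem, gamma=1.0):
    """Align each token of sentence A to the tokens of sentence B.

    Score e_ij = dot(a_i, b_j) + gamma * sem[i][j]. The additive form and
    gamma are illustrative assumptions, not the paper's stated formula.
    Returns (attention weights, B-side summary vector for each token of A).
    """
    dim = len(enc_b[0])
    weights, aligned = [], []
    for i, a in enumerate(enc_a):
        scores = [
            sum(x * y for x, y in zip(a, b)) + gamma * sem[i][j]
            for j, b in enumerate(enc_b)
        ]
        w = softmax(scores)
        weights.append(w)
        aligned.append(
            [sum(w[j] * enc_b[j][k] for j in range(len(enc_b)))
             for k in range(dim)]
        )
    return weights, aligned


# Toy usage: two-token "sentences" with hand-written 2-d encodings.
words_a, words_b = ["loan", "card"], ["borrow", "repay"]
enc_a = [[1.0, 0.0], [0.0, 1.0]]
enc_b = [[1.0, 0.0], [0.0, 1.0]]
sem = sememe_matrix(words_a, words_b, TOY_SEMEMES)
weights, aligned = fused_soft_attention(enc_a, enc_b, sem)
```

In a full model along the lines the abstract describes, the aligned vectors would then be passed to a BiLSTM and a classifier; that part is omitted here.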
Authors CHEN Fan; HUANG Yan; ZHANG Xin-Fang (School of Mechanical Science & Engineering, Huazhong University of Science and Technology, Wuhan 430074, China; School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China)
Source Computer Systems & Applications, 2023, Issue 5, pp. 291-299 (9 pages)
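Funding National Key R&D Program of China (2021YFB2012202); Major Science and Technology Project of Hubei Province (2020AEA011); Key R&D Program of Hubei Province (2020BAB100, 2021BAA171, 2021BAA038).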
Keywords sememe knowledge fusion; Transformer; HowNet; entailment recognition