Research on Sentence Semantic Equivalence Identification with Rich Feature Extraction
Abstract Sentence semantic equivalence identification (SSEI) plays a vital role in question answering. Current Chinese semantic equivalence methods judge the semantics of two questions directly, without a given scenario, so questions with the same meaning can still be misunderstood. This paper therefore proposes RFEM (richer feature extraction model), a model that extracts richer feature information. First, in the encoding layer, a CNN extracts local features and an LSTM stores historical-information features; the fused encoding then passes through a variant multi-head attention mechanism in the alignment layer, which preserves as much of the original information as possible. Second, in the alignment layer, the encoding with residual features added is optimized to avoid the vanishing-gradient problem caused by deepening the network, and the improved model extracts features more effectively. The model reaches 82.71% on the public Chinese dataset BQ, 0.86% higher than the current best result, and 93.2% on a version of BQ cleaned by confidence-interval calculation, 5.1% higher than the baseline result.
Authors LIU Gao-jun (刘高军); KOU Jie (寇婕); DUAN Jian-yong (段建勇); HUO Wei-tao (霍卫涛); WANG Hao (王昊) (College of Information Science, North China University of Technology, Beijing 100144, China; CNONIX National Standard Application and Promotion Lab Department, North China University of Technology, Beijing 100144, China; Artificial Intelligence Research Academy, New Oriental Education & Technology Group, Beijing 100080, China)
Source Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2021, No. 10, pp. 2017-2022 (6 pages)
Funding Supported by the National Natural Science Foundation of China (61972003, 61672040).
Keywords sentence semantic equivalence identification; feature extraction; sentence matching; variant multi-head attention
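
The alignment step described in the abstract (attention between the two encoded questions, with residual features added back) can be illustrated with a small sketch. This is not the paper's variant multi-head attention, whose details the abstract does not give; it assumes plain single-head scaled dot-product attention. The function name `align_with_residual` and the toy vectors are hypothetical, and the input vectors stand in for the fused CNN and LSTM encodings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def align_with_residual(a, b):
    """Align sentence `a` against sentence `b` (hypothetical helper).

    For each token vector in `a`, attend over the token vectors of `b`
    with scaled dot-product attention, then add the original vector back
    as a residual connection so the encoded information is retained.
    """
    dim = len(a[0])
    scale = math.sqrt(dim)
    aligned = []
    for query in a:
        weights = softmax([dot(query, key) / scale for key in b])
        # Attention-weighted summary of b from this token's point of view.
        context = [sum(w * key[i] for w, key in zip(weights, b))
                   for i in range(dim)]
        # Residual connection: original encoding plus aligned context.
        aligned.append([q + c for q, c in zip(query, context)])
    return aligned


# Toy 2-dimensional "encodings" standing in for the encoder output.
sent_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sent_b = [[1.0, 0.0], [0.0, 1.0]]
out = align_with_residual(sent_a, sent_b)
```

Each output row keeps the original token vector and adds a summary of the other sentence, so the result has one vector per token of `sent_a`. A multi-head version would run several such attentions over different projections of the inputs and concatenate the results.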