
Chinese Question Similarity Computation Method Based on Self-Attention and Bi-LSTM (SA-BiLSTM)
Cited by: 1
Abstract: In intelligent customer service question answering systems, user questions tend to have sparse features, heavily colloquial wording, and typos. These properties lower the accuracy of question similarity computation and lead to answers that do not match the question. This paper proposes SA-BiLSTM, a question similarity computation model based on a bidirectional long short-term memory (Bi-LSTM) network. Questions are represented with character vectors; a Bi-LSTM extracts the word-order features of the sentence, and a self-attention mechanism dynamically adjusts the feature weights, improving the model's understanding of the question. Experiments on the WeBank intelligent customer service question matching competition dataset (CCKS2018 Task3) show that character-vector representation works better than word-vector representation for questions, that self-attention lets the model learn more of the key features in a question, and that SA-BiLSTM recognizes questions more reliably, with its F1 score improving by 1.42%.
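The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea: character-level tokenization, self-attention that reweights per-character feature vectors, and a similarity score between two questions. It assumes scaled dot-product self-attention followed by mean pooling and cosine similarity, which may differ from the formulation used in the paper. The helper names (`chars`, `self_attention_pool`, `cosine`) and the toy vectors standing in for Bi-LSTM outputs are hypothetical.

```python
import math


def chars(question):
    """Split a question into characters (character-level units), dropping whitespace."""
    return [c for c in question if not c.isspace()]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(states):
    """Apply scaled dot-product self-attention to a sequence of vectors,
    then mean-pool the attended vectors into one sentence vector.

    `states` stands in for per-character Bi-LSTM outputs (an assumption;
    no LSTM is run here).
    """
    d = len(states[0])
    attended = []
    for query in states:
        # Attention weights of this position over every position.
        weights = softmax([dot(query, key) / math.sqrt(d) for key in states])
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, states)) for i in range(d)]
        )
    n = len(attended)
    return [sum(row[i] for row in attended) / n for i in range(d)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


# Toy feature vectors (hypothetical), one per character of each question.
states_a = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1], [0.1, 0.2, 0.9]]
states_b = [[0.8, 0.2, 0.1], [0.1, 0.9, 0.0], [0.0, 0.1, 0.8]]

similarity = cosine(self_attention_pool(states_a), self_attention_pool(states_b))
print(f"similarity = {similarity:.3f}")
```

In a trained model the feature vectors would come from the Bi-LSTM and the attention parameters would be learned; the sketch only shows how attention weights redistribute emphasis across characters before two questions are compared.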
Authors: HUANG Xiao-zhou (黄晓洲); DUAN Long-zhen (段隆振); ZHOU Ling-yuan (周玲元) (College of Information Engineering, Nanchang University, Nanchang 330029, China; College of Economics and Management, Nanchang Hangkong University, Nanchang 330063, China)
Source: Computer Simulation (《计算机仿真》), Peking University Core Journal, 2022, No. 10, pp. 486-491 (6 pages)
Funding: National Natural Science Foundation of China (71761028).
Keywords: sentence similarity computation; character vector; self-attention mechanism; bidirectional long short-term memory network (Bi-LSTM)