
Semantic Slot Filling Based on BERT and BiLSTM (cited by: 4)
Abstract: Semantic slot filling is an important task in dialogue systems. It aims to assign the correct label to each word of an input sentence, and its performance strongly affects the downstream dialogue management module. Current deep learning approaches to this task typically initialize the model with either random word vectors or pre-trained word vectors. However, random word vectors carry no semantic or syntactic information, while conventional pre-trained word vectors assign only a single meaning to each word; neither can provide the model with context-dependent word representations. To address this problem, we propose an end-to-end deep learning model based on the pre-trained model BERT and a long short-term memory network. The model uses Bidirectional Encoder Representations from Transformers (BERT) to produce context-dependent word embeddings, feeds them into a Bidirectional Long Short-Term Memory network (BiLSTM), and finally decodes the predicted labels with a Softmax function and a conditional random field. The pre-trained BERT model and the BiLSTM network are trained jointly as a whole to improve the performance of the semantic slot filling task. On three datasets (MIT Restaurant Corpus, MIT Movie Corpus, and MIT Movie trivial Corpus), the proposed model achieves maximum F1 scores of 78.74%, 87.60%, and 71.54%, respectively. The experimental results show that the proposed model significantly improves the F1 score of the semantic slot filling task.
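The final decoding step described in the abstract (per-token emission scores refined by a conditional random field) can be sketched with a plain Viterbi decoder. This is a minimal illustration only: the tag names, scores, and the `viterbi_decode` helper below are assumptions for the example, not taken from the paper, and in practice the emission scores would come from the BERT+BiLSTM stack.

```python
def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag sequence under a linear-chain CRF.

    emissions: list of per-token score dicts {tag: score}
    transitions: dict {(prev_tag, tag): transition score}
    Returns the best tag path as a list of tag names.
    """
    tags = list(emissions[0])
    # Initialize with the first token's emission scores.
    score = {t: emissions[0][t] for t in tags}
    backptr = []
    for emit in emissions[1:]:
        new_score, ptr = {}, {}
        for t in tags:
            # Best previous tag for transitioning into t.
            prev = max(tags, key=lambda p: score[p] + transitions[(p, t)])
            ptr[t] = prev
            new_score[t] = score[prev] + transitions[(prev, t)] + emit[t]
        score = new_score
        backptr.append(ptr)
    # Backtrack from the best final tag.
    best = max(tags, key=score.get)
    path = [best]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Toy example: 2 hypothetical slot tags over a 3-token sentence.
emissions = [{"O": 1.0, "B-food": 0.2},
             {"O": 0.3, "B-food": 1.5},
             {"O": 1.1, "B-food": 0.4}]
transitions = {("O", "O"): 0.5, ("O", "B-food"): 0.1,
               ("B-food", "O"): 0.3, ("B-food", "B-food"): -1.0}
print(viterbi_decode(emissions, transitions))  # → ['O', 'B-food', 'O']
```

The transition scores let the CRF penalize implausible tag sequences (here, two consecutive `B-food` tags), which per-token Softmax decoding alone cannot do.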
Authors: ZHANG Yu-shuai (张玉帅); ZHAO Huan (赵欢); LI Bo (李博) (College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China)
Source: Computer Science (《计算机科学》, CSCD, Peking University Core), 2021, No. 1, pp. 247-252 (6 pages)
Funding: National Key R&D Program of China (2018YFC0831800).
Keywords: Slot filling; Pre-trained model; Long short-term memory network; Context-dependent; Word embedding
