Abstract
To address the insufficient and incomplete extraction of semantic features in current Chinese named entity recognition research, a Chinese named entity recognition method based on BERT-BiLSTM-MHA-CRF is proposed. First, the BERT pre-trained language model is used to obtain dynamic word vector representations of the input text, which handles polysemy better. Next, a BiLSTM network combined with a multi-head attention (MHA) mechanism extracts semantic features of the text from multiple dimensions. Finally, a CRF layer produces the globally optimal label sequence. In experiments on the MSRA dataset and the People's Daily dataset, the method achieves better results than the comparison models.
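The final step, in which the CRF layer selects the globally optimal label sequence, is commonly carried out with Viterbi decoding. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The label set (`O`, `B-PER`, `I-PER`), the emission scores (standing in for the output of the BERT-BiLSTM-MHA encoder), and the transition scores are all illustrative assumptions rather than values from the paper.

```python
from typing import List, Sequence

def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the label path with the highest total score.

    emissions[t][j]   : score of label j at token position t.
    transitions[i][j] : score of moving from label i to label j.
    """
    if not emissions:
        return []
    num_labels = len(emissions[0])
    # score[j] is the best score of any path ending in label j so far.
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_score = []
        pointers = []
        for j in range(num_labels):
            # Best previous label for arriving at label j.
            best_prev = max(range(num_labels),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Backtrack from the best final label.
    last = max(range(num_labels), key=lambda j: score[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path

# Hypothetical label set and scores, for illustration only.
LABELS = ["O", "B-PER", "I-PER"]
emissions = [
    [0.1, 2.0, 0.0],  # token 1: encoder prefers B-PER
    [1.0, 0.0, 0.9],  # token 2: encoder slightly prefers O
    [0.0, 0.0, 2.0],  # token 3: encoder prefers I-PER
]
transitions = [
    [0.0, 0.0, -10.0],  # from O: moving to I-PER is heavily penalized
    [0.0, 0.0, 0.5],    # from B-PER
    [0.0, 0.0, 0.5],    # from I-PER
]

greedy = [max(range(len(LABELS)), key=lambda j: row[j]) for row in emissions]
best = viterbi_decode(emissions, transitions)
print([LABELS[k] for k in greedy])
print([LABELS[k] for k in best])
```

In this toy input, choosing the best label per token independently yields `B-PER, O, I-PER`, which contains the penalized `O` to `I-PER` transition. Decoding over the whole sequence instead returns `B-PER, I-PER, I-PER`, which illustrates why a sequence-level layer is placed on top of the per-token features.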
Authors
夏成魁
李少波
XIA Chengkui; LI Shaobo (College of Computer Science and Technology, Guizhou University, Guiyang 550025; State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025)
Source
《计算机与数字工程》
2023, No. 9, pp. 2087-2091, 2102 (6 pages)
Computer & Digital Engineering