Abstract
In Chinese named entity recognition (NER), vectorizing each character is an important step. However, traditional word-vector methods map a character to a single static vector and therefore cannot represent its polysemy. To address this, the BERT pre-trained language model is introduced: BERT strengthens the semantic representation of characters by dynamically generating semantic vectors from their context. Because fine-tuning BERT places high demands on computing hardware, BERT is instead applied with its parameters fixed, serving as an embedding layer, and a BERT-BiLSTM-CRF model is built on top of it. Experimental results show that the BERT-based NER model reaches an F1-score of 94.48% on the MSRA dataset, outperforming traditional machine learning models and other deep-learning-based methods. These results indicate that BERT has good application prospects for named entity recognition tasks.
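To make the described architecture concrete, below is a minimal sketch of a BERT-BiLSTM-CRF model with BERT frozen as a fixed-parameter embedding layer, as the abstract describes. It assumes PyTorch, Hugging Face transformers, and the third-party pytorch-crf package; the checkpoint name, hidden size, and tag count are illustrative assumptions, not values taken from the paper.

```python
# Sketch of BERT-BiLSTM-CRF with frozen BERT ("fixed parameter embedding").
# Assumptions: bert-base-chinese checkpoint, LSTM hidden size 256, and a
# generic BIO tag set; none of these are specified in the abstract.
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # pip install pytorch-crf


class BertBiLSTMCRF(nn.Module):
    def __init__(self, num_tags: int, lstm_hidden: int = 256):
        super().__init__()
        # Pretrained Chinese BERT used purely as a contextual feature extractor.
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        for p in self.bert.parameters():
            p.requires_grad = False  # frozen: BERT is not fine-tuned
        # BiLSTM over the contextual character embeddings.
        self.lstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,
            hidden_size=lstm_hidden,
            batch_first=True,
            bidirectional=True,
        )
        # Project BiLSTM states to per-tag emission scores.
        self.emit = nn.Linear(2 * lstm_hidden, num_tags)
        # Linear-chain CRF models tag-transition constraints (e.g. I- after B-).
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, tags=None):
        with torch.no_grad():  # BERT is frozen, so skip its gradients entirely
            hidden = self.bert(
                input_ids=input_ids, attention_mask=attention_mask
            ).last_hidden_state
        emissions = self.emit(self.lstm(hidden)[0])
        mask = attention_mask.bool()
        if tags is not None:
            # Training: negative log-likelihood of the gold tag sequence.
            return -self.crf(emissions, tags, mask=mask)
        # Inference: Viterbi decoding of the best tag sequence.
        return self.crf.decode(emissions, mask=mask)
```

In this setup only the BiLSTM, linear, and CRF parameters receive gradient updates, which is what keeps the hardware cost low relative to full BERT fine-tuning.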
Authors
赵英明 (ZHAO Yingming)
王浩森 (WANG Haosen)
赵明瞻 (ZHAO Mingzhan)
Hebei University of Architecture, Zhangjiakou, Hebei 075000
Source
Journal of Hebei Institute of Architecture and Civil Engineering (《河北建筑工程学院学报》)
2024, No. 1, pp. 253-257 (5 pages)