Abstract
To improve the accuracy of traditional named entity recognition models on Chinese electronic medical records, a method is proposed that adds adversarial training to the baseline model BERT-BiLSTM-CRF. The method adds perturbations to the word embedding layer to generate adversarial samples, which are then used in iterative training to optimize the model parameters. Experimental results on the CCKS2021 evaluation data set show that, with the two adversarial training methods FGM and PGD added, precision, recall, and F1 value all improve over the baseline model. Comparative experiments further verify that adding adversarial training improves the prediction ability and robustness of the model.
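The perturbation step described in the abstract can be illustrated with a minimal sketch. It assumes the common formulations of FGM (Fast Gradient Method: one step of size epsilon along the L2-normalized gradient of the loss with respect to the embedding) and PGD (Projected Gradient Descent: several smaller steps, each projected back into an L2 ball of radius epsilon). It is not the authors' implementation: the function names, the hyperparameter values, and the toy loss standing in for the BERT-BiLSTM-CRF loss are all hypothetical.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a plain list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def fgm_perturbation(grad, epsilon=1.0):
    """FGM-style perturbation: r = epsilon * g / ||g||_2 (zero if g is zero)."""
    norm = l2_norm(grad)
    if norm == 0.0:
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]


def project_onto_ball(delta, epsilon):
    """Scale delta back onto the L2 ball of radius epsilon if it lies outside."""
    norm = l2_norm(delta)
    if norm <= epsilon:
        return list(delta)
    return [epsilon * d / norm for d in delta]


def pgd_perturbation(grad_fn, embedding, epsilon=1.0, alpha=0.3, steps=5):
    """PGD-style perturbation: repeated small gradient steps with projection."""
    delta = [0.0] * len(embedding)
    for _ in range(steps):
        perturbed = [e + d for e, d in zip(embedding, delta)]
        step = fgm_perturbation(grad_fn(perturbed), alpha)
        delta = project_onto_ball([d + s for d, s in zip(delta, step)], epsilon)
    return delta


# Toy stand-in for the model loss (an assumption made only for this sketch):
# loss(e) = 0.5 * ||e||^2, whose gradient with respect to e is e itself.
def toy_loss(embedding):
    return 0.5 * sum(x * x for x in embedding)


def toy_grad(embedding):
    return list(embedding)


embedding = [1.0, 2.0]
r_fgm = fgm_perturbation(toy_grad(embedding), epsilon=1.0)
r_pgd = pgd_perturbation(toy_grad, embedding, epsilon=1.0, alpha=0.3, steps=5)

# Adversarial samples: the original embedding plus the perturbation.
adv_fgm = [e + r for e, r in zip(embedding, r_fgm)]
adv_pgd = [e + r for e, r in zip(embedding, r_pgd)]
```

In a real training loop, the loss on the adversarial sample would be backpropagated together with the loss on the clean sample before each parameter update, and the embedding would be restored afterwards.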
Authors
孔令巍
朱艳辉
张旭
欧阳康
黄雅淋
金书川
沈加锐
KONG Lingwei; ZHU Yanhui; ZHANG Xu; OUYANG Kang; HUANG Yalin; JIN Shuchuan; SHEN Jiarui (College of Computer Science, Hunan University of Technology, Zhuzhou, Hunan 412007, China; Key Laboratory of Intelligent Information Perception and Processing Technology of Hunan Province, Zhuzhou, Hunan 412007, China)
Source
《湖南工业大学学报》
2022, Issue 3, pp. 36-43 (8 pages in total)
Journal of Hunan University of Technology
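Funding
Natural Science Foundation of Hunan Province (2020JJ6089)
Key Project of the Scientific Research Fund of the Education Department of Hunan Province (19A133).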