
Chinese Medical Entity Recognition Model with Knowledge Fusion (original title: 融合知识的中文医疗实体识别模型)
Abstract: Extracting knowledge from medical texts is important for applications such as computer-aided medical diagnosis systems, and entity recognition is a core step in that process. Most existing entity recognition models are deep learning models trained on annotated data, and they depend heavily on large-scale, high-quality annotations. To make full use of existing medical domain dictionaries and pre-trained language models, this paper proposes a Chinese medical entity recognition model with knowledge fusion. Domain knowledge is extracted from a domain dictionary, the pre-trained language model BERT is introduced as a source of general knowledge, and both kinds of knowledge are integrated into the model. In addition, a convolutional neural network is introduced to improve the model's context modeling ability. Experiments on multiple datasets show that fusing knowledge into the model effectively improves Chinese medical entity recognition.
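As an illustration of what "extracting domain knowledge from a domain dictionary" might look like in practice, the following is a minimal sketch that tags each character of a sentence according to whether it falls inside a dictionary match. The abstract does not describe the matching strategy, so greedy forward maximum matching is an assumption here, and the function name `dictionary_features`, the `max_len` parameter, and the toy lexicon are all hypothetical.

```python
from typing import Dict, List


def dictionary_features(text: str, lexicon: Dict[str, str], max_len: int = 6) -> List[str]:
    """Tag each character B-/I-<type> if it lies inside a lexicon match, else O.

    Assumption: greedy forward maximum matching. The paper may use a
    different matching scheme; this is only an illustrative sketch.
    """
    tags = ["O"] * len(text)
    i = 0
    while i < len(text):
        matched = False
        # Try the longest candidate span starting at position i first.
        for j in range(min(len(text), i + max_len), i, -1):
            entity_type = lexicon.get(text[i:j])
            if entity_type is not None:
                tags[i] = "B-" + entity_type
                for k in range(i + 1, j):
                    tags[k] = "I-" + entity_type
                i = j  # continue after the matched span
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Hypothetical toy lexicon mapping dictionary terms to entity types.
toy_lexicon = {"糖尿病": "DISEASE", "胰岛素": "DRUG"}
features = dictionary_features("糖尿病患者注射胰岛素", toy_lexicon)
```

Per-character tags of this kind could then be embedded and combined with BERT character representations as an additional input to a sequence labeling model. How the paper actually fuses the two knowledge sources is not specified in the abstract.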
Authors: LIU Longhang (刘龙航); ZHAO Tiejun (赵铁军) (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)
Source: Intelligent Computer and Applications (《智能计算机与应用》), 2021, No. 3, pp. 94-97 (4 pages)
Keywords: entity recognition; sequence labeling model; knowledge fusion
