Recognition of medical record entities based on word embedding combined with the BiLSTM-CRF model (cited by: 3)
Abstract: Named entity recognition in Chinese electronic medical records suffers from unclear medical entity boundaries, nested entities, missing sentence components, and heavy reliance on manually extracted features. To address these problems, a named entity recognition model for Chinese electronic medical records based on word embedding combined with a BiLSTM-CRF model is proposed. The electronic medical record text data set is preprocessed by desensitization and sequence labeling, and word embeddings are matched against the medical record text sequences to represent them as word vectors. A BiLSTM neural network models the spatial semantics of the medical record text in the forward and backward directions to obtain the semantic features of the text sequence, and a CRF then predicts the output entity labels. Experimental results show that the improved BiLSTM-CRF model significantly improves the accuracy and recall of medical record entity recognition.
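The final step in the abstract, where a CRF predicts entity labels from the BiLSTM's features, can be illustrated with a minimal sketch of Viterbi decoding. This is not the authors' implementation. The tag set (`O`, `B-DIS`, `I-DIS`), the emission scores standing in for BiLSTM outputs, and the transition scores are all made-up assumptions chosen only to show how transition scores discourage an invalid label sequence.

```python
from typing import Dict, List, Tuple

# Hypothetical BIO tag set for a single "disease" entity type (an assumption).
TAGS = ["O", "B-DIS", "I-DIS"]

def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[Tuple[str, str], float],
                   tags: List[str]) -> List[str]:
    """Return the highest-scoring tag path given per-token emission scores
    and tag-to-tag transition scores (missing transitions score 0.0)."""
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag]: previous tag on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags,
                       key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = (best[-1][prev]
                           + transitions.get((prev, cur), 0.0)
                           + emissions[t][cur])
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Made-up emission scores standing in for BiLSTM outputs on four characters.
TOKENS = ["有", "糖", "尿", "病"]
EMISSIONS = [
    {"O": 2.0, "B-DIS": 0.1, "I-DIS": 0.0},
    {"O": 0.2, "B-DIS": 1.0, "I-DIS": 1.2},
    {"O": 0.1, "B-DIS": 0.2, "I-DIS": 1.5},
    {"O": 0.3, "B-DIS": 0.1, "I-DIS": 1.4},
]
# Made-up transition scores: penalize O -> I-DIS, favor continuing an entity.
TRANSITIONS = {
    ("O", "I-DIS"): -10.0,
    ("B-DIS", "I-DIS"): 0.5,
    ("I-DIS", "I-DIS"): 0.5,
}

# Per-token argmax ignores transitions and can produce an invalid sequence.
greedy = [max(TAGS, key=lambda tag: e[tag]) for e in EMISSIONS]
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(list(zip(TOKENS, greedy)))
print(list(zip(TOKENS, decoded)))
```

With these assumed scores, the per-token argmax labels the second character `I-DIS` directly after `O`, which is not a valid BIO sequence, while the Viterbi path starts the entity with `B-DIS`. In a trained BiLSTM-CRF, both the emission and transition scores would be learned rather than set by hand.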
Authors: Li Chaofan (李超凡); Ma Kai (马凯) (School of Medical Information and Engineering, Xuzhou Medical University, Xuzhou 221004, Jiangsu Province, China)
Source: China Digital Medicine (《中国数字医学》), 2022, No. 4, pp. 32-37 (6 pages)
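Funding: Xuzhou Science and Technology Program, Key Research and Development Program (KC21308); Jiangsu Province Postgraduate Education and Teaching Reform Research and Practice Project (JGZZ19_065); Jiangsu Province College Student Innovation and Entrepreneurship Program (201810313047Y, 201910313004Z).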
Keywords: Electronic medical record; Named entity recognition; Bidirectional long short-term memory neural network; Conditional random field
