
Discussion on Methods of Terminology Recognition in TCM Medical Records (中医病历术语识别方法探讨) Cited by: 2
Abstract Objective: To explore named entity recognition (NER) of medical entities in traditional Chinese medicine (TCM) electronic medical records (EMRs) using only a small amount of labeled corpus, and to provide a reference for more complex information processing of TCM EMRs and for the application of deep learning methods in TCM. Methods: The particularities of the vocabulary and terminology of TCM EMRs, compared with general NER tasks, were analyzed, and the advantages and disadvantages of three current NER techniques were compared in order to find techniques suited to the medical terminology of TCM EMRs. Results: As an unsupervised learning model, the long short-term memory (LSTM) neural network can effectively exploit long-distance dependencies in sequence data and is particularly suitable for processing text sequences. It can also be combined with a conditional random field (CRF) model to address the difficulties of NER in TCM. The combined LSTM-CRF model can learn word features without supervision from unlabeled medical record text and can automatically extract named entities such as patient symptoms, diseases, and precipitating causes without relying on manually designed feature templates. Conclusion: Terminology recognition in TCM EMRs should draw on multiple NER techniques and make full use of their respective strengths to improve recognition accuracy.
Authors SUN Chao (孙超); XIE Qing-yu (谢晴宇) (School of Traditional Chinese Medicine, Capital Medical University, Beijing 100069, China; Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China)
Source Chinese Journal of Library and Information Science for Traditional Chinese Medicine (《中国中医药图书情报杂志》), 2020, Issue 2, pp. 1-5 (5 pages)
Funding Beijing TCM "Xinhuo Chuancheng 3+3 Project" (北京中医药“薪火传承3+3工程”), Cui Xizhang TCM Culture Inheritance Studio (崔锡章中医文化传承工作室).
Keywords named entity recognition (NER); long short-term memory (LSTM) neural network; conditional random fields (CRF); TCM electronic medical records (EMR)
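
The abstract describes pairing an LSTM with a CRF layer to tag symptoms, diseases, and causes in record text. The record gives no implementation details, so the following is only a minimal sketch of the decoding step such a model typically performs: Viterbi search over per-character emission scores (which an LSTM would supply) plus CRF transition scores, followed by reading entities off the BIO tags. The tag set `TAGS`, the example sentence, and all scores are hypothetical values chosen for illustration, not taken from the paper.

```python
# Hypothetical tag set: O = outside, B-SYM / I-SYM = begin / inside a symptom.
TAGS = ["O", "B-SYM", "I-SYM"]

def viterbi(emissions, transitions):
    """Return the highest-scoring sequence of tag indices.

    emissions[t][j]  : score of tag j at position t (e.g. from an LSTM).
    transitions[i][j]: CRF score for moving from tag i to tag j.
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for j in range(n_tags):
            # Best previous tag for arriving at tag j.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptrs.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backptrs.append(ptrs)
    # Trace back from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptrs in reversed(backptrs):
        best = ptrs[best]
        path.append(best)
    path.reverse()
    return path

def extract_entities(chars, tags):
    """Collect (text, type) pairs from a BIO-tagged character sequence."""
    entities, current, ctype = [], [], None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            if current:
                entities.append(("".join(current), ctype))
            current, ctype = [ch], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == ctype:
            current.append(ch)
        else:
            if current:
                entities.append(("".join(current), ctype))
            current, ctype = [], None
    if current:
        entities.append(("".join(current), ctype))
    return entities

# Hypothetical sentence: "the patient has had a headache for three days".
chars = list("患者头痛三天")
# Made-up emission scores per character, ordered as in TAGS.
emissions = [
    [2.0, 0.1, 0.0],
    [2.0, 0.2, 0.1],
    [0.5, 1.5, 1.0],
    [0.6, 0.4, 1.4],
    [1.8, 0.3, 0.5],
    [2.0, 0.1, 0.2],
]
# Made-up transition scores; O -> I-SYM is heavily penalised as invalid BIO.
transitions = [
    [0.5, 0.0, -10.0],
    [0.0, -1.0, 1.0],
    [0.0, -1.0, 0.5],
]

path = viterbi(emissions, transitions)
tags = [TAGS[i] for i in path]
entities = extract_entities(chars, tags)
print(entities)  # prints [('头痛', 'SYM')]
```

The transition matrix is where the CRF contributes: even if a character's emission scores favour `I-SYM` on their own, the penalty on `O -> I-SYM` steers decoding toward well-formed tag sequences.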