Abstract
With the rapid growth of electronic medical record (EMR) data, how to exploit EMR resources deeply and efficiently has become an increasingly pressing problem. Starting from real medical records, this article studies medical entity recognition in EMRs, laying a foundation for better computer-assisted medical care. Using 108 manually annotated real records from a cardiovascular department and three types of feature templates, experiments on extracting disease named entities from cardiovascular EMRs are carried out with a conditional random field (CRF) and with a bidirectional long short-term memory network combined with a conditional random field (BiLSTM-CRF), and the two are compared. The results show that, combined with suitable feature templates, the CRF model achieves better extraction performance and is a more suitable method for extracting named entities from small-scale, partially formatted medical record texts.
Authors
杨荣根
王博
龚乐君
Yang Ronggen; Wang Bo; Gong Lejun (College of Intelligent Science and Control Engineering, Jinling Institute of Technology, Nanjing 211169, China; Big Data Security and Intelligent Processing Key Laboratory of Jiangsu Province, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Source
《南京师范大学学报(工程技术版)》
CAS
2022, No. 1, pp. 81-85 (5 pages)
Journal of Nanjing Normal University (Engineering and Technology Edition)
Keywords
electronic medical record
named entity extraction
conditional random field
feature template
bidirectional long short-term memory network