Abstract
Named entity recognition (NER) in electronic medical records is of great significance to intelligent healthcare and to the construction of medical knowledge graphs. This paper proposes an NER method based on medical categories. First, the characteristics of entities in the electronic medical record corpus are analyzed in depth, and the records are divided into four medical categories. Then, a feature set is constructed for each medical category, and a conditional random field (CRF) model is used to recognize five types of named entities: body parts, symptoms and signs, examinations and tests, diseases and diagnoses, and treatments. Finally, the recognition results obtained with the category-specific feature sets are compared with those obtained with a general feature set. Experimental results show that category-based NER on electronic medical records improves recognition performance significantly and can meet application requirements.
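The idea of category-specific feature sets can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the four medical categories or list the features, so the category names (`symptom_section`, `exam_section`), the cue characters, and every function name below are hypothetical. The sketch only shows how a per-category extractor might extend a general character-window feature set, producing one feature dictionary per character in the form that CRF toolkits commonly accept.

```python
from typing import Callable, Dict, List

Features = Dict[str, object]


def general_features(chars: List[str], i: int) -> Features:
    """General feature set: the current character plus a one-character window."""
    return {
        "bias": 1.0,
        "char": chars[i],
        "prev_char": chars[i - 1] if i > 0 else "<BOS>",
        "next_char": chars[i + 1] if i < len(chars) - 1 else "<EOS>",
    }


# Hypothetical cue characters (head, chest, abdomen); not taken from the paper.
BODY_CUES = set("头胸腹")


def symptom_section_features(chars: List[str], i: int) -> Features:
    """Hypothetical category: adds a body-part cue flag to the general set."""
    feats = general_features(chars, i)
    feats["is_body_cue"] = chars[i] in BODY_CUES
    return feats


def exam_section_features(chars: List[str], i: int) -> Features:
    """Hypothetical category: adds a digit flag (test values are often numeric)."""
    feats = general_features(chars, i)
    feats["is_digit"] = chars[i].isdigit()
    return feats


# Map each (assumed) medical category to its own feature extractor.
FEATURE_SETS: Dict[str, Callable[[List[str], int], Features]] = {
    "symptom_section": symptom_section_features,
    "exam_section": exam_section_features,
}


def sentence_to_features(chars: List[str], category: str) -> List[Features]:
    """Return one feature dict per character, using the category's extractor.

    Unknown categories fall back to the general feature set.
    """
    extractor = FEATURE_SETS.get(category, general_features)
    return [extractor(chars, i) for i in range(len(chars))]


if __name__ == "__main__":
    sentence = list("头痛3天")  # "headache for 3 days"
    for feats in sentence_to_features(sentence, "symptom_section"):
        print(feats)
```

Under these assumptions, one CRF model would be trained per category on the output of `sentence_to_features`, and its scores compared with a model trained on `general_features` alone, mirroring the comparison described in the abstract.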
Authors
LI Fei (李飞), ZHU Yanhui (朱艳辉), WANG Tianji (王天吉), XU Xiao (徐啸), JI Xiangbing (冀相冰)
Affiliations (all authors): 1. College of Computer, Hunan University of Technology, Zhuzhou, Hunan 412007, China; 2. Hunan Key Laboratory of Intelligent Information Perception and Processing Technology, Zhuzhou, Hunan 412007, China
Source
Journal of Hunan University of Technology (《湖南工业大学学报》)
2018, No. 4, pp. 61-66 (6 pages)
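Funding
National Natural Science Foundation of China (61402165)
Key Project of the Hunan Provincial Department of Education (15A049)
Research Fund of the State Administration for Industry and Commerce (2014GSZJWT001KT006)
Key Research Fund Project of Hunan University of Technology (17ZBLWT001KT006)
Hunan Provincial Postgraduate Research and Innovation Fund (CX2017B688)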
Keywords
electronic medical record
named entity recognition
conditional random field
medical category