Abstract
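Chinese obstetric electronic medical records (EMRs) contain a wealth of medical knowledge and health information, and information extraction and assistant diagnosis on these records are of great significance for improving the reproductive health of the population. In an EMR, the admitting diagnosis in the first course record is derived from the chief complaint, auxiliary examinations, physical examination, and other information. A diagnosis usually comprises normal diagnoses, pathological diagnoses, and complications rather than a single result. This paper therefore casts assistant diagnosis as a multi-label classification task. After data cleaning and structuring of the first course records of obstetric EMRs, the diagnostic conclusions are normalized; the text features extracted by LDA are fused with the numerical features of the records by vector concatenation to form new features; and different multi-label sets are formed according to how frequently each diagnosis occurs. Assistant diagnosis is then performed from part of the information in the first course record, using RAkEL, MLkNN, CC, and BP-MLL for multi-label classification. Experimental results show that multi-label classification with the fused features improves the performance of assistant diagnosis on Chinese obstetric EMRs.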
Information extraction and assistant diagnosis on obstetric EMRs are of great significance for improving the reproductive health of the population. Since the admitting diagnosis in the first course record of an EMR is inferred from chief complaints, auxiliary examinations, physical examinations, etc., we cast the diagnostic process as a multi-label classification problem. The features extracted by LDA and the numerical features of the medical records are fused into new features by vector concatenation, and RAkEL, MLkNN, CC, and BP-MLL are used for multi-label classification. The experimental results show that the proposed method can improve assistant diagnosis on Chinese obstetric electronic medical records.
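The two data-preparation steps named in the abstract, forming a multi-label set by diagnosis frequency and fusing topic features with numerical features by vector concatenation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy records, the frequency threshold, and the hard-coded topic vectors (which in practice would come from an LDA model) are hypothetical and are not taken from the paper.

```python
from collections import Counter


def build_label_set(diagnoses, min_count):
    """Keep diagnosis labels that occur at least `min_count` times, sorted by name."""
    counts = Counter(label for record in diagnoses for label in record)
    return sorted(label for label, n in counts.items() if n >= min_count)


def binarize(diagnoses, label_set):
    """Turn each record's diagnoses into a 0/1 indicator row over `label_set`."""
    index = {label: i for i, label in enumerate(label_set)}
    rows = []
    for record in diagnoses:
        row = [0] * len(label_set)
        for label in record:
            if label in index:  # labels below the threshold are dropped
                row[index[label]] = 1
        rows.append(row)
    return rows


def fuse(topic_vec, numeric_vec):
    """Concatenate a topic-distribution vector with numerical features."""
    return list(topic_vec) + list(numeric_vec)


# Hypothetical toy data: three records with their diagnoses.
diagnoses = [
    ["term pregnancy", "gestational diabetes"],
    ["term pregnancy"],
    ["term pregnancy", "anemia", "gestational diabetes"],
]
# Hypothetical stand-ins for LDA topic distributions (2 topics per record).
topic_vecs = [[0.7, 0.3], [0.2, 0.8], [0.5, 0.5]]
# Hypothetical numerical features, e.g. age and gestational week.
numeric_feats = [[29, 39.0], [31, 40.0], [27, 38.0]]

labels = build_label_set(diagnoses, min_count=2)
Y = binarize(diagnoses, labels)
X = [fuse(t, n) for t, n in zip(topic_vecs, numeric_feats)]
```

The resulting `X` (fused feature rows) and `Y` (multi-label indicator rows) have the shape that multi-label learners such as those listed in the abstract expect as input; varying `min_count` yields label sets of different sizes.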
Authors
马鸿超
张坤丽
赵悦淑
昝红英
庄雷
MA Hongchao; ZHANG Kunli; ZHAO Yueshu; ZAN Hongying; ZHUANG Lei (Information Engineering School, Zhengzhou University, Zhengzhou, Henan 450001, China; Industrial Technology Research Institute, Zhengzhou University, Zhengzhou, Henan 450001, China; The Third Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China)
Source
《中文信息学报》
CSCD
Peking University Core Journal
2018, Issue 5, pp. 128-136 (9 pages)
Journal of Chinese Information Processing
Funding
National Basic Research Program of China (973 Program) (2014CB340504)
National Natural Science Foundation of China (61402419, 60970083)
National Social Science Foundation of China (14BYY096)
Open Project of the Key Laboratory of Computational Linguistics, Ministry of Education
Basic Research Project of the Science and Technology Department of Henan Province (142300410231, 142300410308)
Key Science and Technology Project of the Science and Technology Department of Henan Province (172102210478)
Keywords
Chinese obstetric electronic medical record
data cleaning
assistant diagnosis
feature fusion
multi-label classification