Abstract
Electronic medical records (EMRs), a product of hospital informatization, contain rich medical information and clinical knowledge and are an important resource for supporting clinical decision-making and drug mining. How to efficiently mine the information in large volumes of EMR data is therefore an important research topic. In recent years, with the rapid development of computer technology, especially machine learning and deep learning, higher demands have been placed on mining data from the specialized domain of electronic medical records. This review aims to guide the future development of EMR text mining by analyzing the current state of EMR research. Specifically, it first introduces the characteristics of EMR data and common methods for preprocessing it; it then summarizes four typical EMR data mining tasks (medical named entity recognition, relation extraction, text classification, and intelligent consultation), presenting the commonly used basic models for each task along with some of the explorations researchers have made on them; finally, taking two specific disease categories, diabetes and cardio-cerebrovascular diseases, it briefly introduces existing application scenarios of electronic medical records.
Authors
Wu Zongyou (吴宗友)
Bai Kunlong (白昆龙)
Yang Linrui (杨林蕊)
Wang Yiqi (王仪琦)
Tian Yingjie (田英杰)
Affiliations: School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100049; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences (University of Chinese Academy of Sciences), Beijing 100190; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences (University of Chinese Academy of Sciences), Beijing 100190; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049
Source
Journal of Computer Research and Development (《计算机研究与发展》)
Indexed in: EI, CSCD, Peking University Core Journals (北大核心)
2021, No. 3, pp. 513-527 (15 pages)
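Funding
National Natural Science Foundation of China (71731009, 61472390)
Science and Technology Service Network Initiative of the Chinese Academy of Sciences (KFJ-STS-ZDTP-060)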
Keywords
electronic medical records
natural language processing
data mining
machine learning
deep learning