Abstract
Real-world studies based on observational data from electronic medical records (EMRs) have become a hot topic in clinical research. However, the relational data model cannot directly support the representation of temporal relationships between medical events, or the knowledge-fusion queries, that research applications require. To address this, the paper proposes a new RDF-based representation model for medical observational data, which clearly represents multiple event types such as clinical examination, diagnosis, and treatment, as well as the temporal relationships between events. From hospital EMR data, an event graph is built in four steps: data preprocessing, data schema conversion to RDF, temporal relationship construction, and knowledge fusion. Specifically, using EMR data from three Grade-A tertiary hospitals in Shanghai, the authors constructed a medical dataset covering 3 specialties, 173,395 medical events, and 501,335 temporal relationships between events, linked to 5,313 concepts from Chinese medical knowledge bases. Drawing on clinical literature and physicians' research needs, the paper provides 40 sample questions over the dataset, organized by epidemiological study type such as etiology research and treatment research. Some of these questions are compared experimentally against a traditional relational database in terms of query construction and execution, demonstrating the advantages of the event graph. The dataset follows open linked-data standards and is published on OpenKG with an online SPARQL endpoint at https://peg.ecustnlplab.com/dataset.html.
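As a rough illustration of the idea of typed events linked by temporal relations, the sketch below stores events as subject-predicate-object triples and links consecutive events of the same patient. It is not the authors' implementation: the predicate names (`ex:eventType`, `ex:ofPatient`, `ex:before`), the record layout, and the sample records are all hypothetical, since the abstract does not give the actual vocabulary.

```python
from collections import defaultdict
from datetime import date

# Hypothetical predicate names; the paper's real vocabulary is not stated in the abstract.
HAS_TYPE = "ex:eventType"
OF_PATIENT = "ex:ofPatient"
BEFORE = "ex:before"


def build_event_graph(records):
    """Turn (event_id, patient_id, event_type, date) rows into a set of triples."""
    triples = set()
    by_patient = defaultdict(list)
    for event_id, patient_id, event_type, when in records:
        triples.add((event_id, HAS_TYPE, event_type))
        triples.add((event_id, OF_PATIENT, patient_id))
        by_patient[patient_id].append((when, event_id))
    # Link each event to the next later event of the same patient.
    for events in by_patient.values():
        events.sort()
        for (t1, e1), (t2, e2) in zip(events, events[1:]):
            if t1 < t2:
                triples.add((e1, BEFORE, e2))
    return triples


def followed_by(triples, first_type, second_type):
    """Find (e1, e2) pairs where an event of first_type directly precedes one of second_type."""
    types = {s: o for s, p, o in triples if p == HAS_TYPE}
    return sorted(
        (s, o)
        for s, p, o in triples
        if p == BEFORE and types[s] == first_type and types[o] == second_type
    )


# Made-up sample records, for illustration only.
records = [
    ("e1", "p1", "Examination", date(2019, 1, 3)),
    ("e2", "p1", "Diagnosis", date(2019, 1, 5)),
    ("e3", "p1", "Treatment", date(2019, 1, 6)),
    ("e4", "p2", "Diagnosis", date(2019, 2, 1)),
]
graph = build_event_graph(records)
pairs = followed_by(graph, "Diagnosis", "Treatment")
```

With temporal links stored as explicit edges, a question such as "which diagnoses were directly followed by a treatment" becomes a pattern match over triples rather than a self-join on timestamps, which is the kind of query the abstract contrasts with the relational approach.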
Authors
刘旭利
金季豪
阮彤
高大启
殷亦超
葛小玲
LIU Xuli; JIN Jihao; RUAN Tong; GAO Daqi; YIN Yichao; GE Xiaoling (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China; Shanghai Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 200021, China; The Children's Hospital of Fudan University, Shanghai 201108, China)
Source
《中文信息学报》
CSCD
Peking University Core Journals
2020, Issue 11, pp. 37-48 (12 pages)
Journal of Chinese Information Processing
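Funding
National Major Project for New Drug Development (2019ZX09201004)
Fudan Pediatric Medical Consortium Internet Hospital Project Based on the Shanghai Regional Health Information Platform (201701013).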
Keywords
electronic medical record data
patient event graph
knowledge fusion