
Medical Event Model and Open Dataset Construction for Clinical Research (cited by: 3)

Construction of An Open Dataset for Clinical Event Graph
Abstract: Real-world studies based on observational data from electronic medical records have become a hot topic in clinical research. However, the relational data model cannot directly support the representation of temporal relationships between medical events, or the knowledge-fusion queries that research applications require. To address these problems, this paper proposes a new RDF-based representation model for medical observational data. The model can clearly represent multiple event types, such as clinical examination, diagnosis, and treatment, as well as the temporal relationships between events. Starting from hospital electronic medical records, the event graph is built in four steps: data preprocessing, data schema conversion (RDF format conversion), temporal relationship construction, and knowledge fusion. Specifically, using electronic medical records from three Grade-A tertiary hospitals in Shanghai, the authors constructed a medical dataset covering 3 specialties, 173395 medical events, and 501335 temporal relationships between events, linked to 5313 concepts from Chinese medical knowledge bases. Based on the clinical literature and physicians' research needs, the paper provides 40 sample questions for this dataset, organized by study types from public health epidemiology such as etiology research and treatment research. Some of these questions are compared experimentally against a traditional relational database in terms of query formulation and execution, to demonstrate the advantages of the event graph. The dataset follows the Open Link Standard and is published on OpenKG with an online SPARQL site (https://peg.ecustnlplab.com/dataset.html).
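The temporal relationship construction step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual vocabulary or algorithm, so everything here is an assumption made for illustration: the `http://example.org/peg/` namespace, the predicate names `type`, `patient`, `date`, and `before`, the toy records, and the choice to link only consecutive events of the same patient.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter

# Hypothetical namespace and predicate names, for illustration only;
# they are not the vocabulary used by the published dataset.
EX = "http://example.org/peg/"

# (patient_id, event_id, event_type, event_date): toy records, not real data.
records = [
    ("p1", "e2", "Diagnosis", date(2019, 3, 2)),
    ("p1", "e1", "Examination", date(2019, 3, 1)),
    ("p1", "e3", "Treatment", date(2019, 3, 5)),
    ("p2", "e4", "Diagnosis", date(2019, 4, 1)),
]


def build_triples(rows):
    """Return (subject, predicate, object) triples for events and their order."""
    triples = []
    # Sort by patient, then date, then event id so groupby sees ordered runs.
    ordered = sorted(rows, key=lambda r: (r[0], r[3], r[1]))
    for patient, group in groupby(ordered, key=itemgetter(0)):
        events = list(group)
        for _, event_id, event_type, when in events:
            subject = EX + event_id
            triples.append((subject, EX + "type", EX + event_type))
            triples.append((subject, EX + "patient", EX + patient))
            triples.append((subject, EX + "date", when.isoformat()))
        # Assumed rule: link each event to the next one for the same patient.
        for earlier, later in zip(events, events[1:]):
            triples.append((EX + earlier[1], EX + "before", EX + later[1]))
    return triples


triples = build_triples(records)
before_links = [(s, o) for s, p, o in triples if p == EX + "before"]
```

With explicit `before` edges in the graph, a question such as "which treatments followed a given diagnosis" becomes a path traversal rather than a self-join on timestamps, which is the kind of contrast with relational queries that the abstract describes.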
Authors: LIU Xuli (刘旭利); JIN Jihao (金季豪); RUAN Tong (阮彤); GAO Daqi (高大启); YIN Yichao (殷亦超); GE Xiaoling (葛小玲). Affiliations: School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China; Shanghai Shuguang Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 200021, China; The Children's Hospital of Fudan University, Shanghai 201108, China
Source: Journal of Chinese Information Processing (《中文信息学报》), indexed in CSCD and the Peking University Core Journals list, 2020, Issue 11, pp. 37-48 (12 pages)
Funding: National Major New Drug Creation Project (2019ZX09201004); Fudan Pediatric Medical Consortium Internet Hospital Project Based on the Shanghai Regional Health Information Platform (201701013).
Keywords: electronic medical record; patient event graph; knowledge fusion
