Abstract
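Computable medical knowledge emphasizes converting the human-readable knowledge formats found in scientific publications into machine-executable formats through extraction and programming, and it is an important means of promoting the large-scale application of knowledge. It not only provides a new paradigm for knowledge computation research in information science, but also places new demands on digital libraries for storing and managing digital knowledge objects. The basic concept of computable medical knowledge has two aspects: the representation of the knowledge must be computable, and the knowledge must be "executable" in practice. Neither can be dispensed with. This paper summarizes and proposes two paths to realizing computable medical knowledge. The first is data mining: structured data such as tables are turned into digital knowledge objects that a computer can call and execute directly (for example, a disease risk model calculator), which are managed with the Knowledge Grid (K-Grid) and used to support diagnosis. The second is text mining: triples are extracted from unstructured text such as clinical guidelines and the knowledge claims in medical literature, the evidence and data behind each triple are incorporated, and a confidence score is computed; the triples are managed in a graph database (K-Graph) so that knowledge units can be queried and output, for example as treatment recommendations. Finally, the paper discusses the positive significance of computable medical knowledge for deepening information science research and its application scenarios in promoting knowledge translation, knowledge discovery, and evidence-based decision-making. The aim is to introduce an interdisciplinary research approach to medical knowledge computation for the domestic academic community, and to provide a technical and methodological foundation and a reference implementation path for building a learning health system in China that runs "from data to knowledge, from knowledge to practice, and from practice back to data."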
“Computable knowledge” focuses on transforming human-readable knowledge into machine-executable forms by extracting and programming processes on digital knowledge objects. It can be regarded as the “keystone” in supporting the massive knowledge application in the cycle of learning health systems, i.e., “from data to knowledge, from knowledge to practice, and then from practice to data.” This concept has become a new field of research in health data science, and it also provides a new paradigm for digital library and knowledge computation research in the field of library and information science. This study proposes two approaches to making medical knowledge computable. The first is a data-mining-driven approach. Computable knowledge can be extracted from the data in tables of medical literature, expressed in code, and managed in the Knowledge Grid (K-Grid). For example, a machine-executable version of a predictive model can be encoded in any appropriate computer language. When given an instance of data about an individual, this encoded model can quickly and accurately generate a risk prediction or useful advice. The second is a text-mining-driven approach that extracts Subject-Predicate-Object (SPO) triples from unstructured text, such as the assertions in clinical guidelines and medical literature. By incorporating the evidence and data behind a given SPO triple, we can calculate a confidence score for that knowledge unit. The SPO triples can be stored in graph databases (K-Graph) for automatic question answering about a specific condition, such as treatment recommendations ranked by confidence level to support medical intervention decision-making. Several challenges for the development and application of computable medical knowledge are discussed. We hope to introduce an interdisciplinary approach to investigating computable medical knowledge and provide conceptual and technological preparations for the learning health system in China.
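The two approaches described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class and function names (`Evidence`, `Triple`, `recommend`, `estimate_risk`), the logistic form of the risk model, and the confidence formula (the weighted share of supporting evidence) are all assumptions made for illustration, since the abstract does not specify them. All identifiers and values are placeholders, not medical data.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Evidence:
    source_id: str       # placeholder identifier, not a real citation
    supports: bool       # True if this item supports the triple
    weight: float = 1.0  # hypothetical evidence-strength weight


@dataclass
class Triple:
    """A Subject-Predicate-Object knowledge unit with its evidence."""
    subject: str
    predicate: str
    obj: str
    evidence: List[Evidence] = field(default_factory=list)

    def confidence(self) -> float:
        # Assumed formula: weighted share of supporting evidence.
        total = sum(e.weight for e in self.evidence)
        if total == 0:
            return 0.0
        return sum(e.weight for e in self.evidence if e.supports) / total


def recommend(triples: List[Triple], condition: str,
              predicate: str = "treats") -> List[Triple]:
    """Return triples about `condition`, highest confidence first."""
    matches = [t for t in triples
               if t.obj == condition and t.predicate == predicate]
    return sorted(matches, key=lambda t: t.confidence(), reverse=True)


def estimate_risk(features: Dict[str, float],
                  coefficients: Dict[str, float],
                  intercept: float) -> float:
    """Executable knowledge object: an assumed logistic risk model."""
    z = intercept + sum(coefficients[k] * features[k] for k in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder knowledge units (illustrative only).
drug_a = Triple("drug_A", "treats", "condition_X", [
    Evidence("src1", True), Evidence("src2", True),
    Evidence("src3", True), Evidence("src4", False)])
drug_b = Triple("drug_B", "treats", "condition_X", [
    Evidence("src5", True), Evidence("src6", False)])
drug_c = Triple("drug_C", "treats", "condition_Y", [Evidence("src7", True)])

ranked = recommend([drug_b, drug_a, drug_c], "condition_X")
risk = estimate_risk({"x1": 1.0, "x2": 2.0}, {"x1": 0.5, "x2": -0.25}, 0.0)
```

Here `ranked` lists the candidate interventions for `condition_X` by descending confidence, and `risk` is the probability the encoded model returns for one individual's data. A production system would replace the in-memory list with a graph database and take its coefficients from a published model.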
Authors
杜建
孔桂兰
李鹏飞
白永梅
张路霞
Du Jian; Kong Guilan; Li Pengfei; Bai Yongmei; Zhang Luxia (National Institute of Health Data Science, Peking University, Beijing 100191; Advanced Institute of Information Technology, Peking University, Hangzhou 226019)
Source
《情报学报》
CSSCI
CSCD
Peking University Core Journals (北大核心)
2021, No. 11, pp. 1221-1233 (13 pages)
Journal of the China Society for Scientific and Technical Information
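Funding
National Natural Science Foundation of China, General Program: "Theory, Methods, and Applications of Representing and Measuring Uncertain Scientific Knowledge: The Case of Medicine" (72074006)
China Association for Science and Technology, Young Elite Scientists Sponsorship Program: "Structured Representation and Intelligent Computation Models for Medical Knowledge" (2017QNRC001)
National Natural Science Foundation of China, Major Research Plan, Cultivation Project: "Big-Data-Based Visualization and Management Decision-Making for Cross-Regional Care-Seeking by Patients with Chronic Kidney Disease" (91846101)
Beijing Natural Science Foundation, General Program: "Methods and Systems for Predicting Chronic Complications and Mortality Risk in Diabetes Based on Real-World Data" (7212201)
Peking University Health Science Center and University of Michigan Medical School Joint Institute for Translational and Clinical Research project (BMU2020JI011).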
Keywords
可计算知识
数字化知识对象
医学知识图谱
机器可执行
共享
computable knowledge
digital knowledge objects
medical knowledge graph
machine-executable
sharing