Abstract
Mining medical data to generate standard treatment process models for diseases, or to provide decision support for treatment planning, is a current research hotspot. Drug treatment process models of diseases are mined from the medication data of historical patients, and a drug recommendation method that combines the process model with patients' physical sign data is proposed. Specifically, for a given disease, a daily medication list is first generated from historical patient prescribing data, and a Latent Dirichlet Allocation (LDA) topic model is trained on the medication data of all patients to obtain the efficacy topics of drug therapy and the efficacy topic distribution of each treatment day. Next, the per-day efficacy topic distributions are clustered so that each patient's drug treatment process is converted into a sequence of drug efficacy combination labels, on which a probabilistic suffix tree model of the drug treatment process is built. Finally, the probabilistic suffix tree is used to compute, at each node, the probability distribution over the drug efficacy combinations used in subsequent treatment. This distribution and the patient's sign vector form a joint feature, the efficacy combination corresponding to the patient's actual medication serves as the classification label, and an XGBoost classifier is trained on these data and then used to recommend drugs for patients. The feasibility and effectiveness of the proposed method are evaluated on the prescription logs and physical sign data of diabetic patients in the MIMIC-Ⅲ database.
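The step that turns label sequences into next-step probabilities, and then into joint features, can be illustrated with a minimal sketch. It is not the authors' implementation: it approximates a probabilistic suffix tree with plain context counts up to a fixed depth, without the pruning or smoothing a full tree would normally use. The function names, the labels "A", "B", "C", and the sign values are all hypothetical.

```python
from collections import Counter, defaultdict


def build_context_counts(sequences, max_depth=2):
    """Count which label follows every context of length 0..max_depth.

    Simplifying assumption: every observed context is kept (no pruning).
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for i, label in enumerate(seq):
            for k in range(max_depth + 1):
                if i - k < 0:
                    break
                context = tuple(seq[i - k:i])  # the k labels preceding position i
                counts[context][label] += 1
    return counts


def next_label_distribution(counts, history, max_depth=2):
    """Return P(next label | longest suffix of history seen in training)."""
    for k in range(min(max_depth, len(history)), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in counts:
            total = sum(counts[context].values())
            return {label: n / total for label, n in counts[context].items()}
    return {}


def joint_features(distribution, labels, signs):
    """Concatenate the next-label probabilities (fixed label order) with a sign vector."""
    return [distribution.get(label, 0.0) for label in labels] + list(signs)


# Hypothetical efficacy-combination label sequences, one per patient stay.
training = [["A", "B", "C"], ["A", "B", "B"], ["B", "C", "A"]]
counts = build_context_counts(training, max_depth=2)
dist = next_label_distribution(counts, ["A", "B"])
features = joint_features(dist, ["A", "B", "C"], [7.2, 98.0])  # made-up sign values
print(features)
```

A vector such as `features`, paired with the label of the efficacy combination actually used, is the kind of training row that a gradient-boosted classifier like XGBoost could consume.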
Authors
李鹏飞
鲁法明
包云霞
曾庆田
朱冠烨
LI Pengfei; LU Faming; BAO Yunxia; ZENG Qingtian; ZHU Guanye (College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China; College of Electronic and Information Engineering, Shandong University of Science and Technology, Qingdao 266590, China)
Source
《计算机集成制造系统》
EI
CSCD
Peking University Core Journal
2020, Issue 6, pp. 1668-1678 (11 pages)
Computer Integrated Manufacturing Systems
Funding
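National Natural Science Foundation of China (61602279, 61472229)
Science and Technology Development Plan of Shandong Province (2016ZDJS02A11)
Taishan Scholars Program of Shandong Province (ts20190936, tsqn201909109)
Open Fund of the Ocean Telemetry Engineering Technology Research Center, State Oceanic Administration (2018002)
Postdoctoral Innovation Special Fund of Shandong Province (201603056)
Youth Innovation Science and Technology Support Program of Shandong Higher Education Institutions (2019KJN024)
Leading Talents and Outstanding Research Team Program of Shandong University of Science and Technology (2015TDJH102)
Graduate Science and Technology Innovation Project of Shandong University of Science and Technology (SDKDYC190335)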
Keywords
process mining
latent Dirichlet allocation (LDA) topic model
probabilistic suffix tree
XGBoost algorithm
process model