Abstract
To support doctors and patients in using drugs safely, rationally, and efficiently, this paper proposes MaLDA (Medication Analysis based on LDA), a medication analysis method built on Latent Dirichlet Allocation (LDA). MaLDA combines medication records with diagnostic records and treats each drug as a document, each drug function as a topic, and each disease as a word, so that the LDA topic model can infer the latent functions of drugs. These functions link related drugs to one another, related diseases to one another, and drugs to diseases. Drugs are then clustered according to their distributions over functions, each cluster is described by its related diseases, and clinical medication is analyzed on the basis of the clustering results. MaLDA can identify not only drugs that are more effective for a given class of diseases but also latent drug combinations, which lays a foundation for mining drug side effects and disease complications. The method is evaluated on the medication records and diagnostic records of 137 510 patients from a hospital in Shanghai. The results confirm the effectiveness of MaLDA relative to baseline methods for medication analysis on electronic medical records.
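The drug-as-document mapping described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy records, the function names `build_drug_documents` and `fit_lda`, the hyperparameters, the use of collapsed Gibbs sampling for inference, and the dominant-function rule standing in for the clustering step, which the abstract does not specify.

```python
import random
from collections import defaultdict

# Hypothetical toy records as (patient_id, code) pairs; not the paper's data.
medication_records = [(1, "metformin"), (1, "aspirin"), (2, "metformin"),
                      (3, "aspirin"), (3, "atorvastatin"), (4, "atorvastatin")]
diagnostic_records = [(1, "diabetes"), (1, "hypertension"), (2, "diabetes"),
                      (3, "hypertension"), (3, "hyperlipidemia"),
                      (4, "hyperlipidemia")]


def build_drug_documents(med_records, diag_records):
    """Join on patient: each drug is a 'document' whose 'words' are the
    diseases diagnosed in the patients who received that drug."""
    diseases_by_patient = defaultdict(list)
    for patient, disease in diag_records:
        diseases_by_patient[patient].append(disease)
    docs = defaultdict(list)
    for patient, drug in med_records:
        docs[drug].extend(diseases_by_patient[patient])
    return dict(docs)


def fit_lda(docs, n_topics, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA (an assumed inference choice).
    Returns theta (drug -> distribution over functions) and
    phi (one disease distribution per function)."""
    rng = random.Random(seed)
    vocab = sorted({w for words in docs.values() for w in words})
    n_vocab = len(vocab)
    doc_topic = {d: [0] * n_topics for d in docs}             # counts n[d][k]
    topic_word = [defaultdict(int) for _ in range(n_topics)]  # counts n[k][w]
    topic_total = [0] * n_topics                              # counts n[k]
    assign = {d: [] for d in docs}
    # Random initial function assignment for every disease occurrence.
    for d, words in docs.items():
        for w in words:
            k = rng.randrange(n_topics)
            assign[d].append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
    for _ in range(n_iter):
        for d, words in docs.items():
            for i, w in enumerate(words):
                # Remove the current assignment, then resample it.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    theta = {
        d: [(c + alpha) / (len(docs[d]) + n_topics * alpha) for c in counts]
        for d, counts in doc_topic.items()
    }
    phi = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + n_vocab * beta)
         for w in vocab}
        for k in range(n_topics)
    ]
    return theta, phi


docs = build_drug_documents(medication_records, diagnostic_records)
theta, phi = fit_lda(docs, n_topics=2)

# Stand-in clustering rule: group drugs by their most probable function.
clusters = {drug: max(range(2), key=lambda k: dist[k])
            for drug, dist in theta.items()}
# Describe each function by its most probable disease.
top_disease = [max(p, key=p.get) for p in phi]
```

In this sketch, `theta` plays the role of the drug-to-function distribution used for clustering and `phi` supplies the diseases that describe each cluster. On real records a maintained LDA implementation and a proper clustering algorithm over `theta` would likely be preferable to this hand-rolled sampler.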
Authors
ZHOU Jing (周靖); SHE Yuxuan (佘玉轩); XIONG Yun (熊赟)
Affiliations: School of Computer Science, Fudan University, Shanghai 201203, China; Shanghai Key Laboratory of Data Science, Shanghai 201203, China; Shanghai Key Laboratory of Financial Information Technology (Shanghai University of Finance and Economics), Shanghai 200433, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
Indexed in: CSCD; Peking University Core Journals (北大核心)
2016, Issue 18, pp. 8-13 (6 pages)
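Funding
National High Technology Research and Development Program of China (863 Program) (No. 2015AA020105)
National Natural Science Foundation of China (No. 91546105, No. 71331005)
Science and Technology Commission of Shanghai Municipality (No. 14511107302)
Open Project of the Shanghai Key Laboratory of Data Science (No. 201509060001)
Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund (Phase II)
Supported by the National Supercomputer Center in Guangzhou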
Keywords
data mining
medication analysis
topic model
Latent Dirichlet Allocation (LDA)