Abstract
With the rapid accumulation of health and medical data, data-driven medical analyses are receiving increasing attention, and a suitable representation of medical activities is crucial for such analyses. However, most existing representation methods do not account for the temporality and numerical sensitivity of medical data, which limits both the performance and the interpretability of the analyses. Focusing on inpatient cases, this paper presents a representation learning approach for medical activities that is enhanced by topic modeling. The approach uses the temporal relations between activities and their topic assignments to construct an unsupervised multilayer perceptron model. Evaluations on a large real-world inpatient data set show that the learned representations effectively improve three medical analysis tasks (disease clustering, subsequent activity prediction, and remaining length-of-stay prediction) while remaining medically interpretable.
Authors
徐啸
王灜
金涛
王建民
XU Xiao; WANG Ying; JIN Tao; WANG Jianmin (School of Software, Tsinghua University, Beijing 100084, China)
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2019, Issue 3, pp. 169-177 (9 pages)
Journal of Tsinghua University(Science and Technology)
Funding
National Natural Science Foundation of China (Grant No. 71690231)
Keywords
representation learning
topic modeling
multilayer perceptron
medical analyses