Identifying Topics of Online Healthcare Reviews Based on Improved LDA (基于改进LDA的在线医疗评论主题挖掘) Cited by: 23
Abstract: This paper studies the use of topic models to mine healthcare service topics. To address the semantic sparseness and insufficient co-occurrence information that the LDA topic model faces when extracting topics from healthcare reviews, it proposes CO-LDA, a model that combines word co-occurrence analysis with the LDA topic model. First, word co-occurrence analysis is applied to the review corpus to obtain a word co-occurrence matrix. Second, the LDA topic model is used to model the reviews in the corpus, and a hierarchical clustering algorithm is then used to classify the features. Finally, the aspects of healthcare service quality that patients focus on are identified. CO-LDA is compared with the traditional LDA model on three indicators: the average minimum JS distance, the average Kendall rank correlation coefficient τ_b, and the average TF-IDF. The experiments show that CO-LDA outperforms LDA in both topic consistency and topic quality. The experimental results are also highly consistent with China's "Hospital Evaluation Standards", which supports the effectiveness of the CO-LDA-based method for mining topics from online healthcare reviews.
Authors: GAO Hui-ying (高慧颖); LIU Jia-wei (刘嘉唯); YANG Shu-xin (杨淑昕) (School of Economics and Management, Beijing Institute of Technology, Beijing 100081, China)
Source: Transactions of Beijing Institute of Technology (《北京理工大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core list; 2019, No. 4, pp. 427-434 (8 pages)
Funding: National Natural Science Foundation of China (71572013)
Keywords: topic extraction; healthcare service; semantic sparseness; CO-LDA (CO-latent Dirichlet allocation); word co-occurrence analysis
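
The abstract names two building blocks that are easy to illustrate: a word co-occurrence matrix computed from the review corpus, and an average minimum JS distance used to compare topics. The following is a minimal sketch of both, not the authors' CO-LDA implementation. It assumes document-level co-occurrence, base-2 Jensen-Shannon divergence as the "JS distance", and "average minimum" meaning the mean over topics of the divergence to the nearest other topic; the abstract does not give the paper's exact definitions. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count how many documents each unordered word pair appears in together."""
    counts = Counter()
    for tokens in docs:
        # Use distinct words so a pair is counted once per document.
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[(a, b)] += 1
    return counts


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def average_min_js(topics):
    """Average, over topics, of the JS divergence to the nearest other topic.

    Assumed reading of "average minimum JS distance"; larger values mean
    the topics are better separated from one another.
    """
    mins = []
    for i, p in enumerate(topics):
        mins.append(min(js_divergence(p, q)
                        for j, q in enumerate(topics) if j != i))
    return sum(mins) / len(mins)


# Toy, made-up tokenized reviews (not data from the paper).
reviews = [
    ["doctor", "patient", "attitude"],
    ["doctor", "attitude", "waiting"],
    ["waiting", "registration"],
]
pairs = cooccurrence_counts(reviews)

# Toy topic-word distributions over a three-word vocabulary.
topics = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
score = average_min_js(topics)
print(pairs[("attitude", "doctor")], round(score, 4))
```

In a real pipeline the topic-word distributions would come from a fitted topic model rather than being written by hand, and the reviews would first need Chinese word segmentation; both steps are outside this sketch.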