Abstract
[Objective] This paper identifies doctors similar to a given doctor by mining text features, then tags the target doctor using the characteristics of those similar doctors, enriching the description of the doctor's characteristics. [Methods] We used the Word2Vec word-vector model to generate vector representations of each doctor's consultation texts, article titles, and consultation scope, and identified similar doctors from these representations. We then analyzed the characteristics of the similar doctors and used them to collaboratively tag the target doctor. [Results] Tagging based on consultation texts, article titles, and consultation scope achieved accuracies of 0.667, 0.252, and 0.708, respectively; tagging based on a mix of the different texts achieved an accuracy of 1.000. [Limitations] The semantic features of the texts were not mined in depth, and the accuracy and recall of tagging based on a single type of text need improvement. [Conclusions] Tags derived from consultation texts are closely related to patients' immediate needs, while tags derived from article titles are strongly related to doctors' interests. Tags derived from consultation scope and from mixed texts are more accurate. Collaborative tagging of doctors based on text mining can, to some extent, recommend suitable tags.
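The pipeline described in the abstract (vectorize each doctor's text, find similar doctors, borrow their tags) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the tiny hand-written `WORD_VECTORS` table stands in for a trained Word2Vec model, the doctor names, tokens, and tags are invented, and averaging word vectors with cosine similarity and frequency-ranked tags is one plausible reading of the method, not necessarily the authors' exact procedure.

```python
import math
from collections import Counter

# ASSUMPTION: toy 3-dimensional vectors standing in for a trained Word2Vec model.
WORD_VECTORS = {
    "cough": [0.9, 0.1, 0.0],
    "asthma": [0.8, 0.2, 0.1],
    "fever": [0.7, 0.3, 0.0],
    "fracture": [0.0, 0.9, 0.2],
    "knee": [0.1, 0.8, 0.3],
}
DIM = 3

# ASSUMPTION: hypothetical doctors, tokenized texts, and existing tags.
DOCTOR_TOKENS = {
    "A": ["cough", "asthma"],
    "B": ["cough", "fever"],
    "C": ["fracture", "knee"],
    "D": ["asthma", "fever"],
}
DOCTOR_TAGS = {
    "A": [],
    "B": ["respiratory", "pediatrics"],
    "C": ["orthopedics"],
    "D": ["respiratory"],
}


def doc_vector(tokens):
    """Represent a text as the average of its known word vectors."""
    vecs = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vecs:
        return [0.0] * DIM
    return [sum(component) / len(vecs) for component in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def similar_doctors(target, doctor_tokens, k=2):
    """Return the k doctors whose text vectors are closest to the target's."""
    target_vec = doc_vector(doctor_tokens[target])
    scored = [
        (cosine(target_vec, doc_vector(tokens)), name)
        for name, tokens in doctor_tokens.items()
        if name != target
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


def suggest_tags(target, doctor_tokens, doctor_tags, k=2, top_n=3):
    """Suggest tags for the target from the most frequent tags of similar doctors."""
    counts = Counter()
    for name in similar_doctors(target, doctor_tokens, k):
        counts.update(doctor_tags.get(name, []))
    existing = set(doctor_tags.get(target, []))
    return [tag for tag, _ in counts.most_common() if tag not in existing][:top_n]


if __name__ == "__main__":
    print(similar_doctors("A", DOCTOR_TOKENS))
    print(suggest_tags("A", DOCTOR_TOKENS, DOCTOR_TAGS))
```

In practice the lookup table would be replaced by vectors from a model trained on the consultation texts, article titles, or consultation scope, and running the same steps on each text source separately is what would allow the per-source results to be compared.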
Authors
Ye Jiaxin (叶佳鑫)
Xiong Huixiang (熊回香)
Tong Zhaoli (童兆莉)
Meng Qiuqing (孟秋晴)
School of Information Management, Central China Normal University, Wuhan 430079, China; Hubei Communication Technical College, Wuhan 430079, China
Source
Data Analysis and Knowledge Discovery (《数据分析与知识发现》)
Indexed in: CSSCI, CSCD, 北大核心 (Peking University Core Journals)
2020, Issue 6, pp. 118-128 (11 pages)
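Funding
This work is one of the research outcomes of the Central China Normal University Fundamental Research Funds for the Central Universities (Humanities and Social Sciences) major project "Research on Mining and Recommendation of Online Health Information Based on the Semantic Web" (Project No. CCNU19Z02004) and the Central China Normal University Excellent Doctoral Dissertation Cultivation Program (Project No. 2019YBZZ096).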