Expert Recommendation in Q&A Community Based on Topic Interest and Domain Authority
Abstract [Objective] This paper aims to improve the accuracy of expert recommendation in Q&A communities by identifying topics in users' historical Q&A texts in a way that accounts for contextual semantic information. [Methods] We constructed the BERT-LLDA model, which combines the BERT model with the Labeled-LDA topic model and makes full use of label information to vectorize users' historical Q&A texts. Dimension reduction and topic clustering then yield context-aware topics and the probability distribution of each user's topic interests. Based on the topic interest mining results, we constructed a Topic-Sensitive PageRank algorithm (TSPR) and added user quality weights to compute users' domain authority iteratively. On this basis, we proposed TIDARank, an expert recommendation algorithm that considers both topic interest and domain authority, to recommend potential answerers for new questions. [Results] On the Stack Exchange public dataset, the BERT-LLDA model achieved a higher silhouette coefficient (0.5756) and topic coherence (0.4766) after topic clustering than the TF-IDF, BERT, and BERT-LDA baselines. TIDARank reached a best-answerer hit rate ACC@20 of 0.5807 and a mean reciprocal rank MRR@20 of 0.2430, improvements of 0.145 and 0.081, respectively, over the best-performing baseline, BiLSTM+TSPR. [Limitations] The link analysis does not consider user activity. [Conclusions] The BERT-LLDA model not only improves topic clustering but also helps improve the performance of expert recommendation in Q&A communities.
Authors Li Mingzhu (李明珠); Mi Chuanmin (米传民); Gou Xiaoyi (苟小义); Xiao Lin (肖琳) (College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)
Source Data Analysis and Knowledge Discovery (《数据分析与知识发现》), 2024, Issue 5, pp. 68-79 (12 pages); indexed in EI, CSCD, and the Peking University Core Journals list
Funding One of the research outcomes of a Humanities and Social Sciences Fund project of the Ministry of Education (Grant No. 20YJC630163).
Keywords Community Question Answering; Expert Recommendation; BERT; Labeled-LDA; PageRank
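
The abstract reports ACC@20 and MRR@20 but does not define them on this page. The following is a minimal sketch under the usual definitions: ACC@k is the share of test questions whose best answerer appears among the top k recommended users, and MRR@k is the mean of 1/rank of the best answerer, counted as 0 when that user falls outside the top k. The function names `acc_at_k` and `mrr_at_k` and the input layout are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from typing import Hashable, Sequence


def acc_at_k(rankings: Sequence[Sequence[Hashable]],
             best_answerers: Sequence[Hashable],
             k: int = 20) -> float:
    """Fraction of questions whose best answerer is in the top-k list."""
    if not rankings or len(rankings) != len(best_answerers):
        raise ValueError("need one non-empty ranking list per best answerer")
    hits = sum(1 for ranked, best in zip(rankings, best_answerers)
               if best in ranked[:k])
    return hits / len(rankings)


def mrr_at_k(rankings: Sequence[Sequence[Hashable]],
             best_answerers: Sequence[Hashable],
             k: int = 20) -> float:
    """Mean reciprocal rank of the best answerer, 0 if outside the top k."""
    if not rankings or len(rankings) != len(best_answerers):
        raise ValueError("need one non-empty ranking list per best answerer")
    total = 0.0
    for ranked, best in zip(rankings, best_answerers):
        top_k = list(ranked[:k])
        if best in top_k:
            # Ranks are 1-based, so the first position contributes 1.0.
            total += 1.0 / (top_k.index(best) + 1)
    return total / len(rankings)
```

Each entry of `rankings` is the ordered list of recommended user IDs for one question, and the matching entry of `best_answerers` is the user whose answer was accepted as best for that question.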