
Research on User Traits Predicting Based on LDA Topic Model (cited by: 4)
Abstract: User traits can be predicted effectively by applying singular value decomposition (SVD) and logistic regression to online users' "Like" information, but this approach struggles to predict the traits of new users. To address this problem, the paper proposes an online user trait prediction method based on the LDA topic model. The method first uses LDA to extract topics from the text of posts liked by Weibo users, then predicts new users' traits from those topics, and finally compares the results with the traditional SVD-based method. Experiments show that the F1 score improves by up to 0.15 and that computation time falls by 69.09% on average. The work addresses the problem that the inherent tags of "Like" information do not accurately reflect user preferences, avoids the traditional method's need to recompute over new users and their "Like" information at prediction time, and offers another feasible route for user trait analysis.
Authors: WANG Yajing; GUO Qiang; DENG Chunyan; LIN Qingxuan; LIU Jianguo (Research Center for Complex Systems Science, University of Shanghai for Science & Technology, Shanghai 200093, China; Institute of Accounting and Finance, Shanghai University of Finance and Economics, Shanghai 200433, China; Institute of Sina WRD Big Data, Shanghai 210204, China)
Source: Complex Systems and Complexity Science (EI, CSCD), 2020, Issue 4, pp. 9-15 (7 pages)
Funding: National Natural Science Foundation of China (61773248, 71771152); Major Program of the National Social Science Fund of China (18ZDA088, 20ZDA060).
Keywords: user trait prediction; "Like" information; LDA topic model; singular value decomposition; logistic regression
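
The pipeline summarized in the abstract (extract topics from liked text with LDA, then predict a new user's trait from the topic mixture) can be illustrated with a minimal sketch. This is not the authors' implementation: the collapsed Gibbs sampler, the hyperparameters `alpha` and `beta`, the toy word ids, the binary trait labels, and all function names (`fit_lda`, `infer_topics`, `fit_logistic`, `predict_proba`) are assumptions chosen for illustration. It only aims to show why a new user needs no refit: the topic-word distributions stay fixed and only the new user's topic mixture is inferred.

```python
import math
import random

def fit_lda(docs, n_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; docs are lists of integer word ids."""
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]               # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(n_topics)]  # word counts per topic
    n_k = [0] * n_topics                                # total words per topic
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                             # remove current assignment
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                weights = [(n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                           / (n_k[t] + vocab_size * beta) for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights)[0]
                z[d][i] = k                             # record resampled topic
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    phi = [[(n_kw[k][w] + beta) / (n_k[k] + vocab_size * beta)
            for w in range(vocab_size)] for k in range(n_topics)]
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
             for row, doc in zip(n_dk, docs)]
    return phi, theta

def infer_topics(doc, phi, iters=50, alpha=0.1, seed=0):
    """Topic mixture of an unseen document with the topic-word matrix phi held fixed."""
    rng = random.Random(seed)
    n_topics = len(phi)
    z = [rng.randrange(n_topics) for _ in doc]
    counts = [z.count(t) for t in range(n_topics)]
    for _ in range(iters):
        for i, w in enumerate(doc):
            counts[z[i]] -= 1
            weights = [(counts[t] + alpha) * phi[t][w] for t in range(n_topics)]
            z[i] = rng.choices(range(n_topics), weights)[0]
            counts[z[i]] += 1
    return [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]

def predict_proba(x, w, b):
    """Logistic model: probability that the trait label is 1."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Logistic regression fitted by plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = predict_proba(x, w, b) - t              # gradient of the log loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

# Toy data (hypothetical): word ids 0-2 and 3-5 stand for two themes of liked posts.
train_docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2, 1], [0, 0, 1, 2, 2],
              [3, 4, 5, 3, 4], [4, 5, 3, 5, 4], [3, 3, 4, 5, 5]]
train_labels = [1, 1, 1, 0, 0, 0]                       # hypothetical binary trait

phi, theta = fit_lda(train_docs, n_topics=2, vocab_size=6)
w, b = fit_logistic(theta, train_labels)

new_user_doc = [0, 2, 1, 1, 0]                          # liked text of an unseen user
new_theta = infer_topics(new_user_doc, phi)             # no refit of the LDA model
prob = predict_proba(new_theta, w, b)
print(new_theta, prob)
```

In this sketch, scoring a new user costs only a short inference pass over that user's own liked text, whereas an SVD of the user-by-like matrix would, as the abstract notes, have to be recomputed once the new user's row is added.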