Exploit latent Dirichlet allocation for collaborative filtering

Abstract Previous work on the one-class collaborative filtering (OCCF) problem can be roughly categorized into pointwise methods, pairwise methods, and content-based methods. A fundamental assumption of these approaches is that all missing values in the user-item rating matrix are negative. However, this assumption may not hold, because the missing values may contain both negative and positive examples. For example, a user who has given no positive feedback on an item does not necessarily dislike it; they may simply be unfamiliar with it. Meanwhile, content-based methods, e.g., collaborative topic regression (CTR), usually require textual content information about the items, so their applicability is largely limited when that text is not available. In this paper, we propose to apply the latent Dirichlet allocation (LDA) model to OCCF to address these problems. The basic idea is that items are regarded as words, users are regarded as documents, and the user-item feedback matrix constitutes the corpus. Our model drops the strong assumption that missing values are all negative and uses only the observed data to predict a user's interest. Additionally, the proposed model does not need content information about the items. Experimental results indicate that the proposed method outperforms previous methods on various ranking-oriented evaluation metrics. We further combine this method with a matrix factorization-based method to tackle the multi-class collaborative filtering (MCCF) problem, which also achieves better performance in predicting user ratings.
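The mapping described in the abstract (users as documents, items as words, observed feedback as the corpus) can be illustrated with a minimal sketch. The abstract does not specify the authors' inference procedure, so the code below uses a generic collapsed Gibbs sampler for LDA as a stand-in. The function names (`fit_lda_cf`, `score_items`, `recommend`), the hyperparameters `alpha` and `beta`, the topic count, and the toy feedback data are all assumptions made for illustration, not taken from the paper.

```python
import random


def fit_lda_cf(feedback, n_items, n_topics=2, alpha=0.1, beta=0.01,
               n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA with users as documents, items as words.

    feedback[u] lists the item ids user u gave positive feedback on.
    Only observed feedback is used; missing entries are never treated as negative.
    Returns (theta, phi): user-topic and topic-item probability tables.
    """
    rng = random.Random(seed)
    n_users = len(feedback)
    user_topic = [[0] * n_topics for _ in range(n_users)]   # counts n(u, k)
    topic_item = [[0] * n_items for _ in range(n_topics)]   # counts n(k, i)
    topic_total = [0] * n_topics                             # counts n(k)
    assign = []

    # Random initial topic for every observed (user, item) pair.
    for u, items in enumerate(feedback):
        zs = []
        for i in items:
            z = rng.randrange(n_topics)
            zs.append(z)
            user_topic[u][z] += 1
            topic_item[z][i] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for u, items in enumerate(feedback):
            for pos, i in enumerate(items):
                # Remove the current assignment from the counts.
                z = assign[u][pos]
                user_topic[u][z] -= 1
                topic_item[z][i] -= 1
                topic_total[z] -= 1
                # Conditional distribution over topics for this pair.
                weights = [
                    (user_topic[u][k] + alpha)
                    * (topic_item[k][i] + beta)
                    / (topic_total[k] + n_items * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[u][pos] = z
                user_topic[u][z] += 1
                topic_item[z][i] += 1
                topic_total[z] += 1

    theta = [
        [(user_topic[u][k] + alpha) / (len(feedback[u]) + n_topics * alpha)
         for k in range(n_topics)]
        for u in range(n_users)
    ]
    phi = [
        [(topic_item[k][i] + beta) / (topic_total[k] + n_items * beta)
         for i in range(n_items)]
        for k in range(n_topics)
    ]
    return theta, phi


def score_items(theta_u, phi):
    """Predicted preference p(item | user) = sum_k theta[u][k] * phi[k][item]."""
    return [sum(theta_u[k] * phi[k][i] for k in range(len(phi)))
            for i in range(len(phi[0]))]


def recommend(u, feedback, theta, phi, top_n=3):
    """Rank the items user u has not interacted with by predicted preference."""
    seen = set(feedback[u])
    scores = score_items(theta[u], phi)
    unseen = [i for i in range(len(scores)) if i not in seen]
    return sorted(unseen, key=lambda i: scores[i], reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical toy data: four users, six items.
    toy_feedback = [[0, 1, 2], [0, 1], [3, 4, 5], [4, 5]]
    theta, phi = fit_lda_cf(toy_feedback, n_items=6)
    print(recommend(1, toy_feedback, theta, phi, top_n=2))
```

In this sketch the ranking score for a user-item pair is the item's probability under the user's topic mixture, which matches the ranking-oriented evaluation the abstract mentions. How the authors combine the model with matrix factorization for MCCF is not described in the abstract and is not covered here.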

Authors Zhoujun LI, Haijun ZHANG, Senzhang WANG, Feiran HUANG, Zhenping LI, Jianshe ZHOU

Affiliations State Key Laboratory of Software Development Environment; School of Information; College of Computer Science and Technology; Collaborative Innovation Center of Novel Software Technology and Industrialization; Beijing Advanced Innovation Center for Imaging Technology

Source Frontiers of Computer Science (indexed in SCIE, EI, CSCD), 2018, Issue 3, pp. 571-581 (11 pages). Chinese journal title: 中国计算机科学前沿（英文版）

Funding We thank Weike Pan for sharing his code for the GBPR algorithm [1], which allowed us to evaluate that algorithm more efficiently and more fairly. This work was supported by the National Natural Science Foundation of China (NSFC) (Grant Nos. 61370126, 61672081, 71540028, 61571052, 61602237), the National High-tech R&D Program of China (2015AA016004), the Beijing Advanced Innovation Center for Imaging Technology (BAICIT-2016001), the Fund of the State Key Laboratory of Software Development Environment (SKLSDE-2013ZX-19), the Fund of Beijing Social Science (14JGC103), the Statistics Research Project of National Bureau (2013LY055), and the Fund of Beijing Wuzi University, China (GJB20141002).

Keywords latent Dirichlet allocation; one-class collaborative filtering; multi-class collaborative filtering

Classification TP13 [Automation and Computer Technology: Control Theory and Control Engineering]; O156.4 [Science: Fundamental Mathematics]
