Abstract
Traditional collaborative filtering algorithms can readily uncover users' interests, but they suffer from cold-start and data-sparsity problems. To address these problems, this paper proposes a hybrid recommendation algorithm based on a user interest model. First, an LDA topic model is trained on the dataset to obtain an item-topic probability distribution matrix, from which the user's historical interest model is derived. Next, the user's historical behavior information is combined with item content information to obtain the user interest model. Finally, the similarity between the user and each item in the candidate set is computed to produce a Top-N recommendation list. Experiments on the Douban Movies dataset show that the improved algorithm handles sparse data and the cold-start problem better than the traditional recommendation algorithm and markedly improves recommendation quality.
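The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how behavior and content information are combined or which similarity measure is used, so the rating-weighted average and cosine similarity below are assumptions, and the item-topic matrix is a hypothetical stand-in for the output of a trained LDA model. The function names `user_interest`, `cosine`, and `top_n` are likewise illustrative.

```python
import math

def user_interest(item_topic, history):
    """Build a user interest vector over topics.

    Assumption: the vector is the rating-weighted average of the topic
    distributions of the items in the user's history (item index -> rating).
    """
    total = sum(history.values())
    n_topics = len(item_topic[0])
    return [
        sum(rating * item_topic[item][t] for item, rating in history.items()) / total
        for t in range(n_topics)
    ]

def cosine(a, b):
    """Cosine similarity between two vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_n(item_topic, history, n=2):
    """Rank items the user has not seen by similarity to the interest vector."""
    profile = user_interest(item_topic, history)
    candidates = [i for i in range(len(item_topic)) if i not in history]
    candidates.sort(key=lambda i: cosine(profile, item_topic[i]), reverse=True)
    return candidates[:n]

# Hypothetical item-topic matrix (5 items x 3 topics, each row sums to 1),
# standing in for what a trained LDA model would produce.
item_topic = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
]

# Hypothetical history: the user rated item 0 highly and item 2 poorly.
history = {0: 5.0, 2: 1.0}
recommended = top_n(item_topic, history, n=2)
print(recommended)
```

With this toy data the interest vector is dominated by the first topic, so the two unseen items closest to that topic (items 1 and 4) are recommended. Because the interest vector lives in topic space rather than in the user-item rating space, an item with no ratings can still be scored as long as its topic distribution is available, which is the intuition behind the cold-start claim.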
Authors
YU Bo (于波)
YANG Hong-Li (杨红立)
LENG Miao (冷淼)
University of Chinese Academy of Sciences, Beijing 100049, China; Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China
Source
Computer Systems & Applications (《计算机系统应用》)
2018, Issue 9, pp. 182-187 (6 pages)
Keywords
collaborative filtering
user interest model
LDA topic model
recommendation algorithm