Abstract
Collaborative filtering is among the most successful and widely deployed personalized recommendation algorithms in industry, yet it still faces many problems and challenges, one of which is the long-tail distribution of data. Because user behavior data tends to follow a long-tail distribution, collaborative filtering that uses Jaccard similarity to compute item similarity tends to recommend popular items, so popular items become ever more popular and unpopular items ever less so. As recommendations accumulate, the set of items this traditional algorithm recommends keeps shrinking; it may gradually lose its capacity for personalized recommendation and degrade the user experience. To reduce the influence of popular items on the item similarity calculation, this paper proposes a penalty factor based on item popularity to correct the item similarity formula. Experiments and analysis on three public datasets show that the improved method can, to some extent, improve the recommendation algorithm's ability to discover new items.
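The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's penalty formula, so the form used here, multiplying the Jaccard similarity by (1 - popularity) ** alpha, is only an assumed stand-in. The names penalized_similarity, alpha, n_users and the toy data are likewise hypothetical.

```python
def jaccard(users_i, users_j):
    """Jaccard similarity between the sets of users who interacted with two items."""
    union = users_i | users_j
    if not union:
        return 0.0
    return len(users_i & users_j) / len(union)


def penalized_similarity(users_i, users_j, n_users, alpha=1.0):
    """Jaccard similarity of item i to item j, damped by the popularity of item j.

    ASSUMPTION: the factor (1 - popularity_j) ** alpha is an illustrative
    penalty, not the formula from the paper.
    """
    popularity_j = len(users_j) / n_users  # share of all users who touched item j
    return jaccard(users_i, users_j) * (1.0 - popularity_j) ** alpha


# Toy interaction data (hypothetical): item -> set of users who interacted with it.
item_users = {
    "hit": {"u1", "u2", "u3", "u4"},
    "niche_a": {"u1", "u2"},
    "niche_b": {"u2", "u5"},
}
n_users = 5

target = item_users["niche_a"]
candidates = ["hit", "niche_b"]

# Rank the candidate neighbours of "niche_a" with and without the penalty.
plain_ranking = sorted(
    candidates, key=lambda j: jaccard(target, item_users[j]), reverse=True
)
penalized_ranking = sorted(
    candidates,
    key=lambda j: penalized_similarity(target, item_users[j], n_users),
    reverse=True,
)
```

In this toy data, plain Jaccard ranks the popular item first as a neighbour of niche_a, while the penalized score ranks the less popular item first. A larger alpha penalizes popular items more strongly.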
Authors
Yin Hao (尹毫)
Jiao Wenbin (焦文彬)
Shi Guangjun (史广军)
He Xiaotao (何晓涛)
Affiliations: Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100190, China
Source
E-science Technology & Application (《科研信息化技术与应用》)
2019, No. 2, pp. 10-19 (10 pages)
Funding
Chinese Academy of Sciences Online Procurement Platform Application Demonstration Project (Y82971002401)
Keywords
personalized recommendation
collaborative filtering
penalty factor
item similarity
coverage