Abstract
In the era of big data, traditional item-based collaborative filtering suffers from a sparse item similarity matrix, which limits recommendation accuracy. To address this problem, a collaborative filtering algorithm based on labels and an improved Jaccard coefficient is proposed for personalized TV program recommendation. First, the algorithm expands the original data by crawling related information and uses statistical methods to normalize the time features, from which it calculates the user preference coefficients. Next, it selects the most frequently occurring categories among those crawled as recommended category labels and constructs a label similarity matrix using an improved Jaccard coefficient that incorporates a penalty coefficient. Finally, it calculates the recommendation coefficient of each program from the user preference coefficients of the recommended category labels. Experimental results show that the label-based collaborative filtering algorithm reduces the effect of matrix sparsity on recommendation accuracy: compared with the item-based collaborative filtering algorithm, accuracy increases by 5% and recall by 3.1%. In addition, computing similarity with the improved Jaccard coefficient reduces the influence of popular labels on the recommender system and further improves accuracy by 5% and recall by 2.3% over the label-based collaborative filtering algorithm.
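The abstract does not give the exact form of the improved Jaccard coefficient or of the recommendation coefficient, so the following is only a minimal sketch of the general idea under stated assumptions. The logarithmic popularity penalty, the parameter `alpha`, the function names, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Plain Jaccard coefficient |a ∩ b| / |a ∪ b|; 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def penalized_jaccard(a: set, b: set, alpha: float = 1.0) -> float:
    """Jaccard coefficient damped by a popularity penalty.

    ASSUMPTION: the penalty 1 / (1 + alpha * ln(1 + size of the larger set))
    is a hypothetical choice for illustration, not the paper's formula.
    """
    popularity = max(len(a), len(b))
    return jaccard(a, b) / (1.0 + alpha * math.log1p(popularity))


def program_score(program_tags, user_pref, tag_programs, alpha=1.0):
    """Sum user preference weights times label similarity (hypothetical scoring rule).

    program_tags: labels attached to the candidate program
    user_pref:    label -> user preference coefficient
    tag_programs: label -> set of programs carrying that label
    """
    score = 0.0
    for liked_tag, weight in user_pref.items():
        for tag in program_tags:
            if tag == liked_tag:
                sim = 1.0  # assumption: a label is fully similar to itself
            else:
                sim = penalized_jaccard(
                    tag_programs[liked_tag], tag_programs[tag], alpha
                )
            score += weight * sim
    return score


# Toy data (invented for illustration only).
tag_programs = {
    "news": {"p1", "p2", "p3", "p4"},  # a popular label
    "sports": {"p3", "p5"},
    "drama": {"p4", "p6"},
}
user_pref = {"sports": 0.7, "drama": 0.3}

score_sports = program_score(["sports"], user_pref, tag_programs)
score_news = program_score(["news"], user_pref, tag_programs)
```

In this toy setup the popular label "news" overlaps with both of the user's preferred labels, yet the penalty keeps its score well below that of a program labeled "sports", which is the kind of damping of popular labels the abstract describes.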
Authors
Qi Jing (齐晶)
Liu Ying (刘瀛)
Liu Yanxia (刘艳霞)
Hu Meizhen (胡美振)
Le Haifeng (乐海丰)
Affiliations: Tourism College, Beijing Union University, Beijing 100101, China; College of Urban Rail Transit and Logistics, Beijing Union University, Beijing 100101, China; College of Robotics, Beijing Union University, Beijing 100101, China
Source
Journal of Beijing Union University (《北京联合大学学报》)
CAS
2021, No. 2, pp. 47-52 (6 pages)
Funding
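Beijing Union University Talent Development Preferred Program, Top-Notch Talent Project (BPHR2020BZ02)
Science and Technology Program of the Beijing Municipal Education Commission (KM202011417004)
Research Project of Beijing Union University (ZK30202002)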
Keywords
Collaborative filtering
Label category similarity
Personalized recommendation
Penalty coefficient
Jaccard coefficient