Abstract
To address the low accuracy of collaborative-filtering recommender systems on sparse data sets, a recommendation-support model is proposed, together with a neighborhood-based linear least squares fitting (LLSF) algorithm for computing recommendation-support ratings. The model describes the likelihood that a user is more interested in a recommended item. It replaces the traditional expectation estimate with a rating estimate that has high recommendation support, so that the items a user prefers are identified and recommendation accuracy improves. A theoretical analysis argues that, on sparse data sets, the LLSF algorithm is more robust to interference than other algorithms. The model is also easy to combine with other recommendation models and is therefore highly extensible. Experimental results show that the LLSF algorithm markedly improves recommendation accuracy: on the MovieLens data set, its F1 score is more than 3 times that of the traditional kNN algorithm, and the sparser the data set, the larger the gain. On the Book-Crossing data set, as sparsity increases from 91% to 99%, the improvement in F1 score rises from 22% to 125%. The method does not sacrifice recommendation coverage, so its ability to surface long-tail items is preserved.
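The abstract reports its results as F1 scores at different sparsity levels. As a reading aid, the sketch below shows the standard definitions of both quantities. It is a minimal illustration and not the paper's evaluation code: the function names `precision_recall_f1` and `sparsity` and the sample item IDs are hypothetical, and the abstract does not state the exact evaluation protocol (for example, the length of the recommendation list).

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 for one user's recommendation list.

    recommended: item IDs the system recommended (hypothetical input).
    relevant:    item IDs the user actually liked in the test set.
    """
    recommended = set(recommended)
    relevant = set(relevant)
    hits = len(recommended & relevant)  # recommended items the user liked
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def sparsity(num_ratings, num_users, num_items):
    """Fraction of the user-item rating matrix that has no observed rating."""
    return 1.0 - num_ratings / (num_users * num_items)


if __name__ == "__main__":
    # Made-up example: 4 recommended items, 3 relevant items, 2 in common.
    p, r, f1 = precision_recall_f1([1, 2, 3, 4], [2, 4, 6])
    print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
    # Made-up example: 9 ratings in a 10 x 10 matrix.
    print(f"sparsity={sparsity(9, 10, 10):.2f}")
```

Under these definitions, a sparsity of 99% means that only 1% of the possible user-item pairs carry a rating, which is the regime where the abstract reports the largest F1 gain.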
Source
Journal of Xi'an Jiaotong University (《西安交通大学学报》)
Indexed in: EI, CAS, CSCD, PKU Core (北大核心)
2015, No. 6, pp. 77-83 (7 pages)
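Funding
Key Program of the National Natural Science Foundation of China (61233003)
National Natural Science Foundation of China (61262002)
Guangxi Natural Science Foundation (2013GXNSFBA019274)
Guangxi Social Science Planning Research Project (13BXW007)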
Keywords
collaborative filtering
recommender system
neighborhood-based linear least squares fitting
recommendation-support model