Abstract
Sparsity in the item rating matrix severely degrades the recommendation accuracy of collaborative filtering algorithms. To address rating data sparsity, a rating prediction algorithm that combines semantic similarity with matrix factorization (SS-MF) is proposed. SS-MF builds a hierarchy tree of ontology concepts to compute the semantic similarity between items, and uses that similarity to predict and fill part of the missing values in the sparse matrix. The semantic similarity threshold is adjusted so that the filling enriches the matrix data without destroying the intrinsic characteristics of the original matrix. Taking the prefilled rating matrix as the basis, dimensionality reduction and decomposition based on matrix factorization theory are then applied to further predict and fill the missing values. Experimental results show that the algorithm predicts the missing values of the sparse rating matrix. Its mean absolute error is at most 4.62% lower than that of the singular value decomposition algorithm whose sparse matrix is prefilled with normally distributed random numbers (RN-SVD), and on average 1.73% lower than that of the singular value decomposition (SVD) algorithm. The algorithm thus improves the accuracy of rating prediction for sparse matrices in collaborative filtering.
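The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity formula, the filling rule, or the factorization details, so everything below is an assumption made for illustration: Wu-Palmer similarity stands in for the tree-based semantic similarity, a similarity-weighted average stands in for the prefill rule, item means impute whatever is still missing, and a truncated SVD stands in for the factorization step. The concept tree, the item names, and the function names (`wu_palmer`, `semantic_prefill`, `svd_predict`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def ancestors(node, parent):
    """Path from a concept up to the root, starting with the concept itself."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path

def wu_palmer(a, b, parent):
    """Assumed similarity: 2*depth(lca) / (depth(a) + depth(b)), root depth = 1."""
    path_a, path_b = ancestors(a, parent), ancestors(b, parent)
    seen = set(path_a)
    lca = next(n for n in path_b if n in seen)  # lowest common ancestor
    depth_lca = len(ancestors(lca, parent))
    return 2.0 * depth_lca / (len(path_a) + len(path_b))

def semantic_prefill(R, item_concept, parent, threshold):
    """Fill a missing rating only from rated items with similarity >= threshold."""
    n_items = R.shape[1]
    S = np.array([[wu_palmer(item_concept[i], item_concept[j], parent)
                   for j in range(n_items)] for i in range(n_items)])
    filled = R.copy()
    for u, i in zip(*np.where(np.isnan(R))):
        rated = ~np.isnan(R[u])
        w = np.where(rated & (S[i] >= threshold), S[i], 0.0)
        if w.sum() > 0:  # otherwise leave the entry missing
            filled[u, i] = np.dot(w, np.nan_to_num(R[u])) / w.sum()
    return filled

def svd_predict(R, rank):
    """Impute remaining gaps with item means (assumes every item has at least
    one rating), then return the rank-k SVD reconstruction."""
    col_mean = np.nanmean(R, axis=0)
    dense = np.where(np.isnan(R), col_mean, R)
    U, s, Vt = np.linalg.svd(dense, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

# Hypothetical concept tree (child -> parent) and item-to-concept mapping.
parent = {"Media": None, "Film": "Media", "Book": "Media",
          "SciFiFilm": "Film", "DramaFilm": "Film", "Novel": "Book"}
item_concept = ["SciFiFilm", "DramaFilm", "Novel"]

# Rows are users, columns are items, NaN marks a missing rating.
R = np.array([[5.0, np.nan, 1.0],
              [4.0, 4.0, np.nan],
              [np.nan, 2.0, 5.0]])

filled = semantic_prefill(R, item_concept, parent, threshold=0.5)
pred = svd_predict(filled, rank=2)
```

In this toy data the two film items are similar enough to fill each other's gaps, while the novel is below the threshold and its missing rating is left for the factorization stage. Raising the threshold fills fewer entries and stays closer to the original matrix, which is the trade-off the abstract attributes to threshold tuning.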
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2017, Issue A01, pp. 287-291 (5 pages)
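Funding
National Natural Science Foundation of China (61640209, 61540066)
Sichuan Province Science and Technology Innovation Seedling Project (SCMZ2006012)
Science and Technology Program of Guizhou Province (黔科合人字(2015)13号; 黔科合JZ字[2014]2004号; 黔科合JZ字[2014]2001号; 黔科合重大专项字[2013]6020 and [2014]6021; 黔科合高G字[2014]4001)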
Keywords
collaborative filtering
sparse matrix
semantic similarity
matrix factorization
rating prediction