Abstract
[Objective] To address the inability of collaborative filtering algorithms based on explicit feedback to cope with data sparsity and user selection bias. [Methods] Items that a user has seen but not interacted with are taken to express the user's negative preferences, and the visibility of an item to a user is measured by combining user activity, item popularity, and time factors. The concept of pre-use preference is introduced, and a weighted matrix factorization model based on users' time-point visibility is built to identify, among the missing data, the items users are not interested in; these items are then filled with low values. [Results] Experiments on two MovieLens datasets show that, after filling with the proposed data-filling algorithm based on uninteresting-item mining and low-value filling (UIMLF), the recommendation accuracy of ItemCF and BiasSVD improves by 2 to 2.5 times on average. [Limitations] Pre-use preferences are modeled solely on the empirical assumption that "seen-but-not-interacted" items express negative preferences, which may introduce empirical bias. [Conclusions] The proposed method effectively mitigates the effects of data sparsity and user selection bias and yields more accurate recommendations.
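The pipeline summarized in the abstract (visibility weighting, weighted matrix factorization on interaction data, low-value filling of likely uninteresting items) can be illustrated with a minimal sketch. This is not the authors' UIMLF implementation: the visibility formula (geometric mean of normalized user activity and item popularity), the omission of the time factor, the threshold `theta`, the fill value `low_value`, and all function names are assumptions made for illustration only.

```python
import numpy as np

def visibility_weights(R):
    """Assumed weighting: observed cells get weight 1; missing cells get a
    visibility proxy built from user activity and item popularity.
    The time factor mentioned in the abstract is omitted here."""
    observed = R > 0
    activity = observed.sum(axis=1, keepdims=True) / R.shape[1]    # (n_users, 1)
    popularity = observed.sum(axis=0, keepdims=True) / R.shape[0]  # (1, n_items)
    vis = np.sqrt(activity * popularity)                           # broadcasts to (n_users, n_items)
    if vis.max() > 0:
        vis = vis / vis.max()
    return np.where(observed, 1.0, vis)

def weighted_als(P, W, rank=2, reg=0.1, iters=15, seed=0):
    """Weighted matrix factorization of the binary interaction matrix P,
    solved by alternating weighted ridge regressions."""
    rng = np.random.default_rng(seed)
    n_users, n_items = P.shape
    X = rng.normal(scale=0.1, size=(n_users, rank))
    Y = rng.normal(scale=0.1, size=(n_items, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        for u in range(n_users):
            Yw = Y * W[u][:, None]            # rows of Y scaled by user u's weights
            X[u] = np.linalg.solve(Yw.T @ Y + ridge, Yw.T @ P[u])
        for i in range(n_items):
            Xw = X * W[:, i][:, None]         # rows of X scaled by item i's weights
            Y[i] = np.linalg.solve(Xw.T @ X + ridge, Xw.T @ P[:, i])
    return X @ Y.T                            # estimated pre-use preference scores

def fill_uninteresting(R, theta=0.5, low_value=1.0):
    """Fill the missing cells whose estimated pre-use preference falls in the
    lowest `theta` fraction with `low_value`; observed ratings are untouched."""
    observed = R > 0
    scores = weighted_als(observed.astype(float), visibility_weights(R))
    filled = R.astype(float).copy()
    missing_scores = scores[~observed]
    if missing_scores.size == 0:
        return filled
    cutoff = np.quantile(missing_scores, theta)
    filled[(~observed) & (scores <= cutoff)] = low_value
    return filled

# Toy rating matrix: 0 marks a missing rating (hypothetical data).
R = np.array([[5, 4, 0, 0],
              [4, 0, 0, 1],
              [0, 5, 4, 0],
              [0, 0, 5, 4]], dtype=float)
filled = fill_uninteresting(R)
```

The filled matrix could then be passed to any rating-based recommender, such as ItemCF or BiasSVD, in place of the original sparse matrix.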
Authors
石磊
李树青
Shi Lei; Li Shuqing (College of Information Engineering, Nanjing University of Finance & Economics, Nanjing 210023, China)
Source
《数据分析与知识发现》
CSSCI
CSCD
PKU Core Journal
2022, Issue 5, pp. 64-76 (13 pages)
Data Analysis and Knowledge Discovery
Funding
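This work is one of the research outcomes of the following projects:
Major Project of Natural Science Research in Jiangsu Higher Education Institutions (Grant No. 19KJA510011)
Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant No. KYCX20_1348)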
Keywords
Collaborative Filtering
Explicit Feedback
Selection Bias
Pre-use Preference
Uninteresting Items