Abstract
To further improve the recommendation performance of personalized recommendation systems, this paper proposes a new Collaborative Filtering (CF) recommendation algorithm based on the SVDPP algorithm optimized by reinforcement learning. Considering the time effect of user ratings, the recommendation problem is transformed into a Markov Decision Process (MDP). On this basis, the Q-learning algorithm is used to construct a user rating optimization model that fuses timestamp information. Meanwhile, to address the data sparsity problem, missing values are predicted by filling with rounded predicted scores and by boundary completion. Experimental results show that the RMSE of the proposed algorithm is 0.0056 lower than that of the SVDPP algorithm, demonstrating that fusing timestamps and applying reinforcement learning to optimize recommendation performance is feasible.
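The abstract's two core ideas can be illustrated with a minimal sketch: a tabular Q-learning loop that learns an adjustment to a baseline predicted rating, and a fill rule that rounds a predicted score to the nearest integer and clamps it to the rating scale's boundaries. Everything here is hypothetical for illustration — the state design (which in the paper fuses timestamp information), the 1–5 rating scale, the action set, and the hyperparameters are all assumptions, not the authors' actual implementation.

```python
import random

# Assumptions: a 1-5 rating scale and a single-state MDP whose actions
# adjust a baseline predicted rating. The paper's real state/action design
# (fusing timestamps) is richer than this sketch.
R_MIN, R_MAX = 1, 5
ALPHA, EPSILON = 0.1, 0.2            # hypothetical learning rate / exploration rate
ACTIONS = [-0.5, 0.0, 0.5]           # adjustments to the baseline prediction

def fill_missing(pred, low=R_MIN, high=R_MAX):
    """Fill a missing entry: round the predicted score to the nearest
    integer, then clamp it to the rating scale's boundaries."""
    return min(max(round(pred), low), high)

def q_learning(episodes, baseline, true_rating, seed=0):
    """Single-state tabular Q-learning: learn which adjustment to the
    baseline prediction best matches the observed rating."""
    rng = random.Random(seed)
    q = {a: 0.0 for a in ACTIONS}
    for _ in range(episodes):
        # epsilon-greedy action selection
        a = rng.choice(ACTIONS) if rng.random() < EPSILON else max(q, key=q.get)
        r = -abs(baseline + a - true_rating)  # reward = negative absolute error
        # one-step update; each episode is terminal, so no bootstrapped term
        q[a] += ALPHA * (r - q[a])
    return q

q = q_learning(2000, baseline=3.5, true_rating=4.0)
print(max(q, key=q.get))   # the +0.5 adjustment earns the highest reward
print(fill_missing(5.4))   # rounds to 5, already within bounds
print(fill_missing(0.2))   # rounds to 0, clamped up to the lower boundary
```

The single-state simplification keeps the Q-table to one row; in a full recommender the state would encode user/item context and timestamps, but the update rule is the same one-step Q-learning shown here.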
Authors
ZHOU Yunteng
ZHANG Xueying
LI Fenglian
LIU Shuchang
JIAO Jiangli
TIAN Dou
School of Information and Computer, Taiyuan University of Technology, Taiyuan 030600, China
Source
Computer Engineering (《计算机工程》)
Indexed in: CAS, CSCD, Peking University Core Journals (北大核心)
2021, No. 2, pp. 46-51 (6 pages)
Funding
Key R&D Program of Shanxi Province (Social Development Field) (201803D31045)
Natural Science Foundation of Shanxi Province (201801D121138)
Major Science and Technology Project of Shanxi Province (20181102008)
Keywords
Collaborative Filtering (CF)
Singular Value Decomposition (SVD)
reinforcement learning
Markov Decision Process (MDP)
Q-learning algorithm