Abstract
In recent years, the application of deep reinforcement learning to recommendation systems has attracted increasing attention. Building on existing research, this paper proposes a new recommendation model, RP-Dueling, which extends the deep reinforcement learning algorithm Dueling-DQN with a regret-based exploration mechanism, allowing the algorithm to adaptively and dynamically adjust the exploration-exploitation ratio according to its degree of training. The algorithm captures users' dynamic interests and fully explores the action space in recommendation systems with large-scale state spaces. Tested on multiple datasets, the proposed model achieves best average results of 0.16 on MAE and 0.43 on RMSE, which are 0.48 and 0.56 lower than the current best reported results, respectively. Experimental results show that the proposed model outperforms both existing traditional recommendation models and recommendation models based on deep reinforcement learning.
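The abstract describes a Dueling-DQN backbone whose exploration rate is adjusted as training progresses. Below is a minimal sketch of that idea: a standard dueling Q-network head plus a hypothetical regret-driven epsilon schedule. The `adaptive_epsilon` rule and all names are illustrative assumptions; the paper's actual RP-Dueling mechanism is not specified in the abstract.

```python
# Sketch of a Dueling Q-network with an adaptive exploration schedule.
# The regret-based adjustment rule here is an assumption for illustration only.
import torch
import torch.nn as nn


class DuelingQNetwork(nn.Module):
    """Dueling-DQN head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)               # state-value stream V(s)
        self.advantage = nn.Linear(hidden, action_dim)  # advantage stream A(s, a)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.feature(state)
        v = self.value(h)
        a = self.advantage(h)
        return v + a - a.mean(dim=-1, keepdim=True)


def adaptive_epsilon(avg_regret: float, eps_min: float = 0.05, eps_max: float = 1.0) -> float:
    """Hypothetical rule: explore more when the estimated regret is high,
    and shift toward exploitation as regret shrinks during training."""
    return eps_min + (eps_max - eps_min) * min(max(avg_regret, 0.0), 1.0)


if __name__ == "__main__":
    net = DuelingQNetwork(state_dim=16, action_dim=10)
    q_values = net(torch.randn(1, 16))
    eps = adaptive_epsilon(avg_regret=0.3)
    # Epsilon-greedy action selection over the candidate item set.
    action = (torch.randint(0, 10, (1,)) if torch.rand(1).item() < eps
              else q_values.argmax(dim=-1))
    print(q_values.shape, eps, action.item())
```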
Authors
洪志理
赖俊
曹雷
陈希亮
徐志雄
HONG Zhi-li; LAI Jun; CAO Lei; CHEN Xi-liang; XU Zhi-xiong (Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China)
Source
《计算机科学》
CSCD
北大核心 (Peking University Core Journals)
2022, No. 6, pp. 149-157 (9 pages)
Computer Science