Abstract
Recommender systems aim to find and automatically recommend valuable information and services to users from massive data. They can effectively alleviate the information overload problem and have become an important information technology in the era of big data. However, data sparsity, cold start, and interpretability remain key technical difficulties that limit the wide application of recommender systems. Reinforcement learning is an interactive learning technique: by interacting with users and obtaining feedback, it captures their interest drift in real time and thus models user preferences dynamically, which can better address these classical problems of traditional recommender systems. Reinforcement learning has become a hot research topic in the field of recommender systems in recent years. This survey first briefly reviews recommender systems and reinforcement learning and, on that basis, analyzes how reinforcement learning can improve recommender systems. It then organizes and summarizes recent research on reinforcement learning based recommendation, covering traditional reinforcement learning based recommendation and deep reinforcement learning based recommendation respectively. On this basis, it highlights several research frontiers of reinforcement learning based recommendation in recent years, together with its applications. Finally, the future development trends of reinforcement learning in recommender systems are analyzed and discussed.
Authors
余力
杜启翰
岳博妍
向君瑶
徐冠宇
冷友方
YU Li; DU Qi-han; YUE Bo-yan; XIANG Jun-yao; XU Guan-yu; LENG You-fang (School of Information, Renmin University of China, Beijing 100872, China; XUTELI School, Beijing Institute of Technology, Beijing 100081, China)
Source
《计算机科学》
CSCD
Peking University Core Journal
2021, Issue 10, pp. 1-18 (18 pages)
Computer Science
Funding
National Natural Science Foundation of China (71271209)
Research Funds of Renmin University of China (2020030228).
Keywords
Recommender systems
Reinforcement learning
Deep reinforcement learning
Markov decision process
Multi-armed bandits