Movie Recommendation System Dynamically Sensing User's Interests Drift Based on Flink

Abstract: Traditional distributed recommendation systems built on the Hadoop platform with collaborative filtering algorithms have two problems that urgently need solving. First, when faced with massive data and complex recommendation models, data processing slows markedly, so low latency cannot be achieved and users cannot be given real-time recommendations. Second, traditional collaborative filtering algorithms cannot perceive drift in user interests in real time, which leads to unsatisfactory recommendations. To address both problems, Flink, a new-generation stream computing engine, is introduced, and a movie recommendation system is built with big data components such as Spark, Flume, and Kafka. The recommendation algorithms are divided into two modules, offline and online. The offline algorithm introduces heap sort to solve the problem that the ALS algorithm in MLlib computes a Cartesian product during model prediction, which consumes a large amount of memory and takes a long time to execute. The online (real-time) algorithm introduces the Ebbinghaus forgetting curve and combines time weights with reward-punishment factors to dynamically perceive drift in user interests. Together, these improvements yield better personalized Top-N recommendations and improve the end-user experience. The experimental results show that: (1) the offline ALS algorithm improved with heap sort executes significantly faster while the RMSE remains essentially unchanged; (2) the real-time algorithm, which introduces the Ebbinghaus forgetting curve and combines time weights with reward-punishment factors, clearly improves accuracy and recall, and its recommendations better match user interests; (3) as the data volume grows, algorithms execute faster on the Flink engine than on the Spark engine.
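The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the exponential retention form `exp(-t / S)`, the `stability_days` parameter, and the function names `time_weight` and `top_n` are all assumptions made for illustration. The reward-punishment factor is omitted because the abstract does not define it, and the code runs in plain Python rather than on Spark or Flink.

```python
import heapq
import math


def time_weight(elapsed_days: float, stability_days: float = 30.0) -> float:
    """Ebbinghaus-style retention weight exp(-t / S).

    Assumed form: the abstract does not state the exact formula, and
    stability_days is a hypothetical tuning parameter.
    """
    if elapsed_days < 0:
        raise ValueError("elapsed_days must be non-negative")
    return math.exp(-elapsed_days / stability_days)


def top_n(scored_items, n: int):
    """Return the n highest-scoring (item, score) pairs, best first.

    A min-heap bounded at n entries avoids sorting (or materializing)
    every candidate score at once.
    """
    if n <= 0:
        return []
    heap = []  # min-heap of (score, item); the smallest kept score is heap[0]
    for item, score in scored_items:
        if len(heap) < n:
            heapq.heappush(heap, (score, item))
        elif score > heap[0][0]:
            # New score beats the weakest kept candidate: swap it in.
            heapq.heapreplace(heap, (score, item))
    return [(item, score) for score, item in sorted(heap, reverse=True)]


# Hypothetical data: (movie, rating, days since the rating was given).
ratings = [("A", 5.0, 0), ("B", 5.0, 60), ("C", 3.0, 0)]
weighted = [(movie, rating * time_weight(days)) for movie, rating, days in ratings]
recommended = top_n(weighted, 2)
print(recommended)
```

In this toy data, movie B has the same rating as A but was rated 60 days ago, so its decayed score falls below that of the recently rated C. This is the kind of recency effect a forgetting-curve weight is meant to capture.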

Authors: LI Guangming; YANG Panpan; GU Chan (College of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi'an, Shaanxi 710021, China; School of Electrical and Control Engineering, Shaanxi University of Science and Technology, Xi'an, Shaanxi 710021, China)

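Affiliations: College of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology; School of Electrical and Control Engineering, Shaanxi University of Science and Technology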

Source: Chinese Journal of Electron Devices (《电子器件》), CAS, 2024, No. 5, pp. 1425-1433 (9 pages)

Funding: National Natural Science Foundation of China (62003201).

Keywords: Flink; heap sort; Ebbinghaus forgetting curve; time weight; reward-punishment factors

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

