Abstract
As data volumes keep growing, traditional recommendation systems suffer from low computational efficiency, slow real-time recommendation, and unsatisfactory recommendation quality. To address these problems, we use Apache Flink, a new-generation stream computing engine, as the computing platform for recommendation and combine it with open-source big data technologies such as Hadoop, Hive, Redis, ZooKeeper, and Kafka to build a distributed recommendation system. In addition, Alink is used to improve the efficiency of the offline recommendation algorithm in distributed scenarios, and the real-time recommendation algorithm is improved by using each user's recent historical ratings and incorporating a time decay function to generate a Top-N real-time recommendation list. The results show that the accuracy, recall, and normalized discounted cumulative gain (NDCG) of the recommendation results all improve, and that the improved algorithm yields better recommendations.
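The abstract does not give the form of the time decay function or of the scoring rule, so the following is only a minimal sketch of the general idea. It assumes an exponential half-life decay and a precomputed item-to-item similarity table. The names `time_decay`, `top_n`, `similar_items`, and `half_life_seconds` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def time_decay(age_seconds, half_life_seconds=86400.0):
    """Assumed decay form: the weight halves every half_life_seconds."""
    return 0.5 ** (age_seconds / half_life_seconds)


def top_n(recent_ratings, similar_items, now, n=3, half_life_seconds=86400.0):
    """Score unseen candidates from one user's recent ratings.

    recent_ratings: list of (item_id, rating, timestamp) tuples.
    similar_items: dict mapping item_id -> list of (candidate_id, similarity).
    Returns up to n (candidate_id, score) pairs, best score first.
    """
    seen = {item for item, _, _ in recent_ratings}
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)

    for item, rating, timestamp in recent_ratings:
        decay = time_decay(now - timestamp, half_life_seconds)
        for candidate, similarity in similar_items.get(item, []):
            if candidate in seen:
                continue  # do not recommend items the user already rated
            weight = similarity * decay
            weighted_sum[candidate] += weight * rating
            weight_total[candidate] += weight

    # Weighted average of ratings, so older ratings count for less.
    scores = {
        candidate: weighted_sum[candidate] / weight_total[candidate]
        for candidate in weighted_sum
        if weight_total[candidate] > 0
    }
    # Sort by descending score, breaking ties by candidate id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

In a deployment like the one the abstract describes, the recent ratings would presumably arrive through Kafka and be cached in Redis, with the scoring running inside a Flink job. That wiring is omitted here because the abstract does not detail it.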
Authors
郑江文
赵超
ZHENG Jiangwen; ZHAO Chao (School of Information and Electrical Engineering, Hebei University of Engineering, Handan, Hebei 056038, China)
Source
《信息与电脑》
2022, No. 19, pp. 108-112 (5 pages)
Information & Computer