Experimental Study on an ALS Prediction Model Based on the Spark Platform
Abstract: Large volumes of data are generated worldwide every day, and processing them quickly enough to be useful is a pressing problem. As big data and data mining technologies mature, personalized recommendation plays an increasingly important role in daily life. This paper studies rating prediction in two settings: the same user rating similar movies, and similar users rating the same movie. It searches for optimal model parameters to improve prediction accuracy, so that movies a user is likely to be interested in can be recommended effectively. Using the MovieLens dataset, the authors train an ALS (alternating least squares) model on the Spark platform to obtain the optimal parameters, and compare the ratings predicted by this model with those predicted by a mean-value model. The experimental results show that the proposed model predicts ratings with clearly higher accuracy.
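The workflow the abstract describes (fit ALS, pick parameters by prediction error, compare against a mean-value baseline) can be sketched as follows. This is a minimal pure-Python illustration and not the paper's Spark code: the toy ratings, the parameter grid, the use of RMSE as the error measure, and the choice of the global training mean as the "mean model" are all assumptions made for the example, and every function name here is hypothetical.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve the small dense system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[k][n] / m[k][k] for k in range(n)]


def half_step(fixed, observed, rank, lam):
    """Hold one side's factors fixed; ridge-regress each vector on the other side."""
    updated = {}
    for key, obs in observed.items():
        a = [[lam if p == q else 0.0 for q in range(rank)] for p in range(rank)]
        b = [0.0] * rank
        for other, rating in obs:
            f = fixed[other]
            for p in range(rank):
                b[p] += rating * f[p]
                for q in range(rank):
                    a[p][q] += f[p] * f[q]
        updated[key] = solve(a, b)
    return updated


def train_als(train, rank, lam, iterations=15, seed=0):
    """Alternate user and item updates, starting from random item factors."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for u, i, r in train:
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))
    items = {i: [rng.random() for _ in range(rank)] for i in by_item}
    users = {}
    for _ in range(iterations):
        users = half_step(items, by_user, rank, lam)
        items = half_step(users, by_item, rank, lam)
    return users, items


def rmse(pairs):
    """Root-mean-square error over (predicted, actual) pairs."""
    return math.sqrt(sum((p - r) ** 2 for p, r in pairs) / len(pairs))


def als_rmse(model, data):
    """RMSE of the factor model's dot-product predictions on (user, item, rating) rows."""
    users, items = model
    return rmse([(sum(x * y for x, y in zip(users[u], items[i])), r)
                 for u, i, r in data])


# Toy, exactly rank-1 ratings (made up for illustration; not MovieLens).
user_scale = [1.0, 1.25, 1.5, 2.0]
item_scale = [1.0, 1.5, 2.0, 2.5]
ratings = [(u, i, user_scale[u] * item_scale[i])
           for u in range(4) for i in range(4)]
validation_cells = {(1, 3), (2, 1)}
test_cells = {(0, 0), (3, 3)}
held_out = validation_cells | test_cells
train = [t for t in ratings if t[:2] not in held_out]
validation = [t for t in ratings if t[:2] in validation_cells]
test = [t for t in ratings if t[:2] in test_cells]

# Grid search: keep the (rank, lambda) pair with the lowest validation RMSE.
grid = list(itertools.product([1, 2], [0.01, 0.1, 1.0]))
scores = {(rank, lam): als_rmse(train_als(train, rank, lam), validation)
          for rank, lam in grid}
best_rank, best_lam = min(scores, key=scores.get)
best_model = train_als(train, best_rank, best_lam)

# Mean-value baseline (assumed form): predict the global training mean everywhere.
mean_rating = sum(r for _, _, r in train) / len(train)
baseline_val = rmse([(mean_rating, r) for _, _, r in validation])
baseline_test = rmse([(mean_rating, r) for _, _, r in test])

print("best (rank, lambda):", (best_rank, best_lam))
print("validation RMSE  ALS:", scores[(best_rank, best_lam)], " mean:", baseline_val)
print("test RMSE        ALS:", als_rmse(best_model, test), " mean:", baseline_test)
```

On a real Spark cluster one would normally rely on the platform's own ALS implementation rather than hand-written solvers. The sketch only shows the shape of the experiment: parameters are chosen on a validation split, and the final comparison with the baseline is made on ratings the model never saw.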
Authors: JIANG Tingting; DU Zhenjun (Dalian Maritime University, Dalian 116026)
Affiliation: Dalian Maritime University
Source: Computer & Digital Engineering, 2018, No. 11, pp. 2310-2314 (5 pages)
Keywords: big data; Spark platform; ALS model; rating prediction