Abstract
Collaborative filtering is one of the key technologies for coping with information overload, but its predictions are still not accurate enough. Based on an analysis of the Spark technology and framework and a discussion of the shortcomings of the Slope One algorithm, an improved Slope One algorithm that addresses the similarity between items and users is proposed and implemented on the Spark platform. Experimental results show that the improved Slope One algorithm achieves higher prediction accuracy and runs in parallel on Spark. Speedup and Sizeup measurements indicate that the algorithm parallelizes and scales well, which improves its efficiency.
Authors
黄婕
刘长生
刘程莉
HUANG Jie; LIU Changsheng; LIU Chengli (Department of Aviation Electronic Equipment Maintenance, Airforce Aviation Repair Institute of Technology, Changsha 410124, China; Hunan Key Laboratory of Intelligent Information Perception & Processing Technology, Hunan University of Technology, Zhuzhou, Hunan 412007, China; School of Engineering, Computer and Aviation, University of León, León 24071, Spain)
Source
《湖南工业大学学报》
2019, No. 4, pp. 47-53 (7 pages)
Journal of Hunan University of Technology