Abstract
To recommend videos that match a user's interests quickly and accurately from massive online film and television resources, this paper proposes an intelligent recommendation scheme based on Affinity Propagation (AP) clustering on the Spark platform. Data are stored in a distributed file system, and AP clustering is run on resilient distributed datasets (RDDs) to deliver real-time recommendations and speed up clustering. In addition, the Minkowski similarity measure is introduced into AP clustering, replacing the original Euclidean distance measure when constructing the similarity matrix, in order to improve clustering accuracy. The scheme is validated experimentally on the widely used MovieLens movie dataset. The results show that the Spark platform effectively improves the efficiency of the clustering computation, and that the improved AP clustering achieves higher recommendation accuracy than standard AP clustering and the K-means clustering algorithm.
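The abstract's central idea, building the AP similarity matrix from a Minkowski distance instead of the Euclidean one, can be sketched as follows. This is a minimal single-machine NumPy illustration, not the paper's Spark implementation. The abstract does not give the similarity formula, so the choice of negative Minkowski distance as the similarity, the order `p`, the median preference, and the function names `minkowski_similarity` and `affinity_propagation` are all assumptions made for illustration. The message-passing updates are the standard AP responsibility and availability rules.

```python
import numpy as np

def minkowski_similarity(X, p=3.0):
    """Similarity s(i, k) = -(sum_d |x_id - x_kd|^p)^(1/p) (assumed form).

    p = 2 gives the negative Euclidean distance, p = 1 the negative
    Manhattan distance.
    """
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return -np.power(np.sum(diff ** p, axis=2), 1.0 / p)

def affinity_propagation(S, damping=0.7, max_iter=200, stable_iter=15):
    """Standard AP message passing on a precomputed similarity matrix S.

    Returns (exemplars, labels); labels are -1 if no exemplar emerged.
    """
    n = S.shape[0]
    S = S.astype(float).copy()
    # Preference (self-similarity): median of off-diagonal similarities.
    np.fill_diagonal(S, np.median(S[~np.eye(n, dtype=bool)]))
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    rows = np.arange(n)
    exemplars = np.array([], dtype=int)
    stable = 0
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = np.max(AS, axis=1)
        R_new = S - first[:, None]
        R_new[rows, idx] = S[rows, idx] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, np.diag(R))
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = np.diag(A_new).copy()
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, self_avail)
        A = damping * A + (1 - damping) * A_new
        # Points with a(k,k) + r(k,k) > 0 are the current exemplars.
        current = np.flatnonzero(np.diag(A) + np.diag(R) > 0)
        stable = stable + 1 if np.array_equal(current, exemplars) else 0
        exemplars = current
        if exemplars.size > 0 and stable >= stable_iter:
            break
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    # Assign every point to its most similar exemplar.
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels

if __name__ == "__main__":
    # Hypothetical toy data: two groups of 3-dimensional feature vectors.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.3, (10, 3)), rng.normal(5.0, 0.3, (10, 3))])
    exemplars, labels = affinity_propagation(minkowski_similarity(X, p=3.0))
    print("exemplar indices:", exemplars)
    print("labels:", labels)
```

In a recommender built along these lines, the rows of `X` would be user or item feature vectors (for example, rating profiles from MovieLens), and recommendations would be drawn from the cluster a user falls into. How the paper distributes this computation over RDDs is not described in the abstract and is not modeled here.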
Authors
ZHANG Min (张敏)
CHENG Peng-xiang (程鹏翔)
Film and Television Department, Shaanxi Art Vocational College, Xi’an 710054, China
Source
Information Technology (《信息技术》)
2021, No. 9, pp. 30-33, 38 (5 pages)
Funding
2019 Vocational Education Research Project of the Shaanxi Society of Vocational and Technical Education (SZJYB19-227).
Keywords
intelligent recommendation
affinity propagation clustering
Spark architecture
distributed dataset
similarity measurement