Abstract
Collaborative filtering algorithms are inefficient when applied to personalized recommendation in massive-data environments. To address this, a location-bitcode-based index tree for personalized recommendation, called LB-Tree, was designed around the characteristics of the MapReduce framework, applying an index structure to personalized recommendation. A storage strategy based on the differences between clusters was used to improve the parallelism of MapReduce task processing. According to the data distribution of each cluster, the data objects in a cluster were partitioned into layers by concentric circles centered on the cluster centroid, and each layer was represented by binary bitcodes of a different length. The bitcodes of all data objects were then organized into an index tree, which shortens the search path for frequently recommended data and allows the search space to be determined quickly during recommendation. Compared with the item-based Top-N recommendation algorithm and the Similarity-Based Neighborhood Method (SBNM), LB-Tree showed the slowest growth in time cost and the highest accuracy, which verifies the effectiveness and efficiency of the method.
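The layering and encoding idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the encoding, so the sketch assumes equal-width rings, a prefix-free code in which inner layers receive shorter codes, and a plain nested-dictionary tree. The function names (`centroid`, `layer_of`, `layer_code`, `build_tree`, `lookup`) are hypothetical, and the MapReduce storage strategy is not modeled.

```python
import math


def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))


def layer_of(point, center, radius, num_layers):
    """Index of the concentric ring (0 = innermost) containing the point.

    Assumption: rings have equal width radius / num_layers.
    """
    if radius == 0:
        return 0
    dist = math.dist(point, center)
    return min(int(dist / radius * num_layers), num_layers - 1)


def layer_code(layer, num_layers):
    """Assumed prefix-free code with layer-dependent length.

    For three layers: 0 -> '0', 1 -> '10', 2 -> '11'.
    """
    if layer == num_layers - 1 and layer > 0:
        return "1" * layer
    return "1" * layer + "0"


def build_tree(points, num_layers=3):
    """Organize the points of one cluster into a binary tree keyed by bitcode."""
    center = centroid(points)
    radius = max(math.dist(p, center) for p in points)
    root = {}
    for p in points:
        code = layer_code(layer_of(p, center, radius, num_layers), num_layers)
        node = root
        for bit in code:
            node = node.setdefault(bit, {})
        node.setdefault("items", []).append(p)
    return root


def lookup(root, code):
    """Follow the bitcode from the root and return the points stored there."""
    node = root
    for bit in code:
        node = node.get(bit)
        if node is None:
            return []
    return node.get("items", [])
```

Under these assumptions, points near the centroid are reached after a single branch, while outer layers need longer paths. Which layers actually receive the shorter codes in LB-Tree is a design decision the abstract does not state.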
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2016, No. 2, pp. 419-423, 427 (6 pages in total)
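Funding
Key Project of the Natural Science Foundation of Hubei Province (2015CFA067)
Key Project of the Scientific Research Program of the Hubei Provincial Department of Education (D20151001)
Wuhan Science and Technology Key Research Program (2013012401010851)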