Abstract
This study addresses the low precision and computing latency of traditional collaborative filtering recommendation algorithms based on user ratings, and proposes a parallel hybrid recommendation algorithm based on tagging and collaborative filtering. The algorithm reduces the weight of popular tags by calculating the TF-IDF (term frequency-inverse document frequency) value of each tag, predicts users' preferences for other resources from their historical behavior, and finally ranks the predicted preference values to produce Top-N recommendations. The algorithm's computational efficiency and complexity were analyzed theoretically, and it was implemented using the MapReduce parallel programming model. In experiments, it was compared with the collaborative filtering algorithm in Mahout, a project of the Apache Software Foundation. The results show that the proposed algorithm achieves higher accuracy and can effectively improve recommendation efficiency.
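The tag-weighting and Top-N steps described in the abstract can be sketched on a single machine as follows. This is only an illustrative sketch, not the paper's method: the abstract does not give the exact formulas, so the TF-IDF definition (each user's tag list treated as a "document"), the additive scoring rule, and the names `tag_tfidf` and `top_n` are all assumptions, and the MapReduce parallelization is omitted.

```python
import math
from collections import Counter


def tag_tfidf(user_tags):
    """Weight each user's tags by TF-IDF (assumed formulation).

    user_tags maps a user to the list of tags that user has applied.
    Each user's tag list is treated as one "document", so a tag used
    by every user gets an IDF of log(1) = 0.
    """
    n_users = len(user_tags)
    # Document frequency: number of users who used each tag at least once.
    df = Counter(tag for tags in user_tags.values() for tag in set(tags))
    weights = {}
    for user, tags in user_tags.items():
        counts = Counter(tags)
        total = sum(counts.values())
        weights[user] = {
            tag: (count / total) * math.log(n_users / df[tag])
            for tag, count in counts.items()
        }
    return weights


def top_n(user, weights, item_tags, seen, n=2):
    """Rank unseen items by the sum of the user's tag weights (assumed rule)."""
    scores = {
        item: sum(weights[user].get(tag, 0.0) for tag in tags)
        for item, tags in item_tags.items()
        if item not in seen
    }
    # Highest score first; ties are broken by item name for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Hypothetical toy data, for illustration only.
user_tags = {
    "u1": ["python", "python", "popular"],
    "u2": ["java", "popular"],
    "u3": ["popular"],
}
item_tags = {"a": ["popular"], "b": ["python"], "c": ["java", "popular"]}
weights = tag_tfidf(user_tags)
recommendations = top_n("u1", weights, item_tags, seen=set(), n=2)
```

In this toy data the tag `popular` is used by every user, so its weight is zero and it contributes nothing to the ranking, which is the down-weighting effect the abstract attributes to TF-IDF.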
Source
《高技术通讯》
CAS
CSCD
PKU Core (Peking University Core Journals)
2015, Issue 3, pp. 307-312 (6 pages)
Chinese High Technology Letters
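Funding
Supported by:
National Natural Science Foundation of China (61402023)
Beijing Natural Science Foundation (4132025)
Beijing Higher Education Young Elite Teacher Project (YETP1448)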