
基于MapReduce的分块压缩矩阵Apriori的并行化研究

Parallel Study on the Apriori Algorithm of Compression Matrix Based on MapReduce Programming Mode
Abstract: The classic Apriori algorithm must scan the database repeatedly, which makes it unsuitable for large-scale data. To address this problem, an improved Apriori algorithm is proposed. Using Boolean-vector relational operations, the algorithm scans the transaction database and transforms it into a compression matrix. Under the MapReduce framework, the compression matrix is divided into blocks, and each block is processed in parallel. The sub-compression matrices are used to quickly compute all candidate itemsets, from which the frequent K-itemsets are generated, reducing the time complexity of the Apriori algorithm.
Source: Journal of Xi'an University (Natural Science Edition), 2015, No. 4, pp. 26-30 (5 pages)
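Funding: Natural Science Foundation of Fujian Province (2015J01660); Ningde Normal University Project for Serving the Western Taiwan Straits (Haixi) Region (2012H405); Fujian Province College Students' Innovation and Entrepreneurship Training Program (201410398059)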
Keywords: association rules; MapReduce; compression matrix; Apriori
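
The pipeline summarized in the abstract can be sketched roughly as follows. This is only an illustrative, single-process approximation and not the authors' implementation: the record gives no details of the compression scheme or the MapReduce job layout, so the specific choices here (dropping infrequent items, merging identical Boolean rows with a weight, striding rows into blocks) and all function names (`build_compressed_matrix`, `map_block`, `reduce_counts`, `generate_candidates`, `apriori_blocks`) are assumptions made for the example. The "map" and "reduce" steps are plain Python functions standing in for a real MapReduce job.

```python
from collections import Counter
from itertools import combinations


def build_compressed_matrix(transactions, min_support):
    """Single scan: drop infrequent items and merge identical Boolean rows.

    Assumed compression scheme (not taken from the paper): each distinct
    Boolean row is stored once together with the number of transactions
    that produced it.
    """
    item_counts = Counter(item for t in transactions for item in set(t))
    items = sorted(i for i, c in item_counts.items() if c >= min_support)
    rows = Counter()
    for t in transactions:
        row = tuple(item in t for item in items)  # Boolean vector of one transaction
        if any(row):
            rows[row] += 1
    return items, list(rows.items())  # [(boolean_row, weight), ...]


def map_block(block, candidates, index):
    """'Map' step: local support of every candidate within one matrix block."""
    counts = {}
    for cand in candidates:
        cols = [index[item] for item in cand]
        # Logical AND over the candidate's columns, weighted by row multiplicity.
        counts[cand] = sum(w for row, w in block if all(row[c] for c in cols))
    return counts


def reduce_counts(partials):
    """'Reduce' step: sum the local supports from all blocks."""
    totals = Counter()
    for partial in partials:
        totals.update(partial)
    return totals


def generate_candidates(frequent, k):
    """Candidate k-itemsets whose (k-1)-subsets are all frequent."""
    prev = set(frequent)
    items = sorted({i for s in frequent for i in s})
    return [c for c in combinations(items, k)
            if all(sub in prev for sub in combinations(c, k - 1))]


def apriori_blocks(transactions, min_support, n_blocks=2):
    """Level-wise search over a block-partitioned compressed matrix."""
    items, rows = build_compressed_matrix(transactions, min_support)
    index = {item: j for j, item in enumerate(items)}
    blocks = [rows[i::n_blocks] for i in range(n_blocks)]  # assumed partitioning
    result = {}
    candidates = [(i,) for i in items]
    k = 1
    while candidates:
        partials = [map_block(b, candidates, index) for b in blocks]
        totals = reduce_counts(partials)
        frequent = {c: n for c, n in totals.items() if n >= min_support}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(list(frequent), k)
    return result
```

Calling `apriori_blocks(transactions, min_support=2)` on a list of item sets returns a dictionary that maps each frequent itemset (as a sorted tuple) to its support count. The transaction list is read only once, when the matrix is built; every later pass works on the compressed blocks, which is the property the abstract credits for the reduced running time.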

