Abstract
With the explosive growth of data, traditional algorithms can no longer meet the needs of big data mining, and distributed, parallel association rule mining algorithms are needed to address this problem. MapReduce is a popular distributed parallel computing model that has been widely adopted for its ease of use, good scalability, automatic load balancing, and automatic fault tolerance. This paper classifies and reviews existing parallel association rule mining algorithms based on the MapReduce computing model, summarizes their respective advantages, disadvantages, and scope of application, and outlines directions for future research.
Authors
Xiao Wen (肖文)
Hu Juan (胡娟)
Zhou Xiaofeng (周晓峰)
Xiao Wen; Hu Juan; Zhou Xiaofeng (Dept. of Electrical Information Engineering, Hohai University Wentian College, Maanshan, Anhui 243031, China; School of Computer & Information, Hohai University, Nanjing 210098, China)
Source
Application Research of Computers (《计算机应用研究》)
CSCD
PKU Core Journal
2018, No. 1, pp. 13-23 (11 pages)
Funding
Natural Science Research Project of Anhui Higher Education Institutions (KJ2016A623)