Abstract
Incremental updating algorithms are one effective way to mine rapidly growing data, and MRFUP is the most representative of them. Its pruning strategy reduces the computation needed for association rules, but the algorithm is inefficient on fast-growing data and performs frequent computation over the newly added data. To improve the efficiency of incremental association rule updating on massive data, this paper extends PFP, an algorithm that processes association rules in parallel, and proposes MRPFP, a PFP-based incremental updating algorithm for association rules. The algorithm can take full advantage of the storage and parallel computing capabilities of a cloud platform. Experimental results show that MRPFP processes massive data more efficiently than MRFUP and is better suited to association rule mining on massive data.
Source
《合肥工业大学学报(自然科学版)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2015, No. 4, pp. 500-503, 551 (5 pages in total)
Journal of Hefei University of Technology: Natural Science