Abstract
To address two drawbacks of the Apriori association rule algorithm, namely repeated scans of the database and the generation of large numbers of candidate frequent itemsets, an improved algorithm is proposed and implemented on the MapReduce model. The improved Apriori algorithm needs to scan the entire database only once to obtain the set of all frequent itemsets. Simulation results show that, as the number of nodes increases, the improved algorithm runs in less time than the original algorithm, and that this advantage grows as more nodes are added. This indicates that, in a heterogeneous cluster environment, the MapReduce-based Apriori algorithm can improve the efficiency of association rule mining. The improved distributed association rule algorithm was applied in a distributed educational decision support system, and mining of real data demonstrated that the method is effective for educational decision-making.
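The abstract does not spell out how the single scan is achieved. As a rough illustration only, one common way to count itemsets in a single pass under a map/reduce split is sketched below. It assumes each transaction is expanded into all of its subsets up to a size cap, and it runs in one process rather than on a cluster. The function names, the `max_size` cap, and the demo data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def map_transaction(transaction, max_size):
    """Map step (assumed design): emit (itemset, 1) for every non-empty
    subset of the transaction with at most max_size items."""
    items = sorted(set(transaction))
    for k in range(1, min(max_size, len(items)) + 1):
        for itemset in combinations(items, k):
            yield itemset, 1


def reduce_counts(pairs, min_support):
    """Reduce step: sum the counts per itemset and keep only the itemsets
    whose total count reaches min_support."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return {s: c for s, c in counts.items() if c >= min_support}


def frequent_itemsets(transactions, min_support, max_size=3):
    """Read each transaction exactly once, then aggregate the emitted pairs."""
    pairs = (
        pair
        for transaction in transactions
        for pair in map_transaction(transaction, max_size)
    )
    return reduce_counts(pairs, min_support)


if __name__ == "__main__":
    # Hypothetical toy data, not from the paper.
    demo = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, count in sorted(frequent_itemsets(demo, min_support=3).items()):
        print(itemset, count)
```

The trade-off in this sketch is that the number of emitted subsets grows quickly with transaction length, which is why it caps the itemset size. Whether the paper's algorithm handles this the same way is not stated in the abstract.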
Source
《武汉理工大学学报(信息与管理工程版)》
CAS
2013, Issue 1, pp. 40-43 (4 pages)
Journal of Wuhan University of Technology: Information & Management Engineering
Funding
Supported by the Teaching Research Fund of the Hubei Provincial Department of Education (2009240)