
Research and Application of Distributed Data Mining Methods (cited by: 6)

Distributed Data Mining Methods Based on Cloud Computing
Abstract: The Apriori association rule algorithm scans the database repeatedly and generates a large number of candidate frequent itemsets. To address these shortcomings, an improved algorithm is proposed and implemented on the MapReduce model. The improved Apriori algorithm needs to scan the entire database only once to obtain the set of all frequent itemsets. Simulation results show that, as the number of nodes increases, the improved algorithm runs in less time than the original algorithm, and that this advantage widens as more nodes are added. This indicates that, in a heterogeneous cluster environment, the MapReduce-based Apriori algorithm can improve the execution efficiency of association rule mining. The improved distributed association rule algorithm was applied in a distributed educational decision support system, and mining of real data showed that the method is effective for educational decision-making.
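The abstract does not spell out how the single-scan variant works, so the following is only a minimal sketch of one way a one-pass, MapReduce-style frequent itemset count could look, not the authors' algorithm. It assumes each map call emits every subset of a transaction up to a size cap, and a reduce step sums the counts. The names `map_transaction`, `reduce_counts`, `frequent_itemsets`, the `max_size` cap, and the sample transactions are all hypothetical, and the map and reduce phases run in a single process here rather than on a cluster.

```python
from collections import Counter
from itertools import combinations


def map_transaction(transaction, max_size):
    """Map step (assumed design): emit (itemset, 1) for every subset
    of the transaction containing 1..max_size items."""
    items = sorted(set(transaction))  # canonical order so equal itemsets match
    for k in range(1, min(max_size, len(items)) + 1):
        for itemset in combinations(items, k):
            yield itemset, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, min_support, max_size=3):
    """Read each transaction exactly once, then keep itemsets whose
    absolute support count reaches min_support."""
    pairs = (pair for t in transactions for pair in map_transaction(t, max_size))
    counts = reduce_counts(pairs)
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical sample data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]
result = frequent_itemsets(transactions, min_support=3)
```

The trade-off in this sketch is that a single scan is paid for by enumerating subsets in the map step, which grows quickly with transaction length; the `max_size` cap keeps that bounded. Classic Apriori instead prunes candidates level by level at the cost of one scan per level.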
Authors: 汪丽 (WANG Li), 张露 (ZHANG Lu)
Source: Journal of Wuhan University of Technology: Information & Management Engineering (《武汉理工大学学报(信息与管理工程版)》), CAS, 2013, No. 1, pp. 40-43 (4 pages)
Funding: Supported by the Teaching Research Fund of the Hubei Provincial Department of Education (2009240)
Keywords: distributed data mining; MapReduce model; association rules; distributed educational decision support system

References (8)

  • [1] FU Y J. Distributed data mining: an overview [R]. [S.l.]: IEEE TCDP Newsletter, 2001.
  • [2] MARIO C, ANTONIO C, ANDREA P, et al. Distributed data mining on grids: services, tools, and applications [J]. IEEE Transactions on Systems, Man, and Cybernetics: Part B, Cybernetics, 2004, 34(6): 2451-2465.
  • [3] KULKARNI U P, YARDI A R. Exploring the capabilities of mobile agents in distributed data mining [C]//Proceeding of the Tenth International Database Engineering & Applications Symposium. India: [s.n.], 2006: 277-280.
  • [4] MINGSYAN C, JIAWEI H, PHILIP S Y. Data mining: an overview from a database perspective [J]. IEEE Transaction on Knowledge and Data Engineering, 1996, 8(6): 866-883.
  • [5] MEHMED K. Data Mining: Concepts, Models, Methods, and Algorithms [M]. Translated by 闪四清, 陈茵, 程雁, et al. Beijing: Tsinghua University Press, 2003: 56-121.
  • [6] AGRAWAL R, TOMASZ I, ARAN S. Mining association rules between sets of items in large databases [C]//Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data. New York: ACM, 1993: 207-216.
  • [7] 钱少华 (QIAN Shaohua), 蔡勇 (CAI Yong), 钱雪忠 (QIAN Xuezhong). An improved Apriori algorithm based on arrays [J]. Computer Applications and Software, 2006, 23(2): 111-113. (Cited by: 16)
  • [8] DEAN J, GHEMAWAT S. MapReduce: simplified data processing on large clusters [C]//OSDI'04: Sixth Symposium on Operating System Design and Implementation. San Francisco: [s.n.], 2004: 107-113.
