A mining algorithm for distributed global maximal frequent itemsets (cited by: 1)
Abstract: A distributed algorithm for mining global maximal frequent itemsets, DMFI, is proposed. It has two phases, local mining and global mining, and stores the data in an improved frequent pattern tree (IFP-tree) derived from the FP-tree. In the local mining phase, each site builds its own IFP-tree, storing frequent items in a fixed order, and then mines its local maximal frequent itemsets by scanning its local database. In the global mining phase, the sites exchange their local maximal frequent itemsets by broadcasting messages through group communication, and the set of global maximal frequent itemsets is mined from them. DMFI was implemented and tested under a variety of conditions; the results indicate good performance relative to other algorithms.
Source: Journal of Central South University (Science and Technology), 2012, No. 9, pp. 3517-3523 (7 pages). Indexed in: EI, CAS, CSCD, Peking University Core.
Funding: Natural Science Foundation of Shaanxi Province (2009JM7007)
Keywords: data mining; association rules; distributed mining; maximal frequent itemsets
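
The two phases named in the abstract can be illustrated with a small Python sketch. It is not the paper's method: brute-force enumeration stands in for the IFP-tree, pooling the local results inside one process stands in for the group-communication broadcast, and the same relative support threshold (`min_ratio`) is assumed at every site and globally. All function names are hypothetical. The sketch relies on one standard fact: an itemset that is frequent over the whole database must be frequent on at least one site, so it is a subset of some local maximal frequent itemset.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def keep_maximal(itemsets):
    """Drop every itemset that is a proper subset of another one."""
    unique = set(itemsets)
    return {s for s in unique if not any(s < other for other in unique)}


def local_maximal(transactions, min_ratio):
    """Brute-force local phase: maximal frequent itemsets of one site."""
    min_count = min_ratio * len(transactions)
    items = sorted(set().union(*transactions))
    frequent = [
        frozenset(c)
        for size in range(1, len(items) + 1)
        for c in combinations(items, size)
        if support(frozenset(c), transactions) >= min_count
    ]
    return keep_maximal(frequent)


def global_maximal(sites, min_ratio):
    """Global phase: pool local results, recount globally, keep maximal sets."""
    # Stand-in for the broadcast: pool every site's local maximal itemsets.
    pooled = set().union(*(local_maximal(db, min_ratio) for db in sites))
    # Every globally frequent itemset is a subset of some pooled itemset,
    # so the subsets of the pooled itemsets are the only candidates.
    candidates = {
        frozenset(c)
        for m in pooled
        for size in range(1, len(m) + 1)
        for c in combinations(sorted(m), size)
    }
    min_count = min_ratio * sum(len(db) for db in sites)
    frequent = [
        c for c in candidates
        if sum(support(c, db) for db in sites) >= min_count
    ]
    return keep_maximal(frequent)


# Two toy sites with four transactions each (illustrative data only).
site_a = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("d")]
site_b = [frozenset("ab"), frozenset("bd"), frozenset("bd"), frozenset("d")]
result = global_maximal([site_a, site_b], min_ratio=0.5)
print(sorted("".join(sorted(s)) for s in result))  # ['ab', 'd']
```

In this toy data the local maximal itemsets are {a, b, c} on the first site and {b, d} on the second, yet neither is frequent over both sites together. This is why the global phase has to recount support across all sites rather than simply merge the local results.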
References (17)

  • 1. Bayardo R J. Efficiently mining long patterns from databases[C]//Proceedings of the ACM SIGMOD Int'l Conference on Management of Data. New York: ACM Press, 1998: 85-93.
  • 2. Song Yuqing, Zhu Yuquan, Sun Zhihui, Chen Geng. Algorithms for mining and updating maximal frequent itemsets based on FP-Tree[J]. Journal of Software, 2003, 14(9): 1586-1592. Cited by: 164
  • 3. SG J, Chen C. MMFI: An effective algorithm for mining maximal frequent itemsets[C]//Proceedings of the IEEE International Symposium on Information Processing. IEEE CS Press, 2008: 26-30.
  • 4. Burdick D, Calimlim M, Gehrke J. Mafia: A maximal frequent itemset algorithm for transactional databases[C]//Proceedings of the 17th International Conference on Data Engineering. Heidelberg: IEEE Computer Society Press, 2001: 443-452.
  • 5. Agrawal R, Srikant R. Fast algorithms for mining association rules[C]//Proceedings of the 20th International Conference on Very Large Data Bases. Santiago, San Francisco: Morgan Kaufmann Publishers Inc, 1994: 487-499.
  • 6. Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[C]//Proceedings of 2000 ACM SIGMOD Int'l Conference on Management of Data. Dallas: ACM Press, 2000: 1-12.
  • 7. Agrawal R, Shafer J. Parallel mining of association rules[J]. IEEE Transactions on Knowledge and Data Engineering, 1996, 8(6): 962-969.
  • 8. Zaiane O R, El-Hajj M, Lu P. Fast parallel association rule mining without candidacy generation[C]//Proceedings of the 2001 IEEE International Conference on Data Mining. Washington: IEEE Computer Society Press, 2001: 665-668.
  • 9. Javed A, Khokhar A. Frequent pattern mining on message passing multiprocessor systems[J]. Distributed and Parallel Databases, 2004, 16(3): 321-334.
  • 10. Song Baoli, Qin Zheng. A fast method for mining distributed global frequent itemsets[J]. Journal of Xi'an Jiaotong University, 2006, 40(8): 923-927. Cited by: 11
