Algorithm for mining frequent itemsets with item constraint based on partition

Cited by: 1

Abstract: To improve the efficiency of association rule mining and its adaptability to large datasets, this paper proposes Partition_CHS_Miner, a partition-based algorithm for mining frequent itemsets with item constraints. The algorithm prunes the dataset according to the constraints and stores transactions in a constraint-based hyper-structure (CHS). A large dataset is first divided into disjoint subsets, each small enough to fit in main memory. Local frequent itemsets are then mined from each subset with the hyper-structure-based, item-constrained mining algorithm. Finally, the local frequent itemsets from all subsets are merged into a set of global constrained candidate itemsets, from which the global frequent itemsets are computed. Experimental results demonstrate the effectiveness of the algorithm.
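The two-phase flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the form of the item constraint, so the sketch assumes an itemset satisfies the constraint when it contains at least one item from a `required` set. The local miner is a plain level-wise search standing in for the CHS-based miner, partitions are cut by transaction count rather than by memory size, and the names `count_support`, `mine_local`, and `partition_mine` are hypothetical.

```python
from math import ceil


def count_support(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def mine_local(partition, min_support, required):
    """Stand-in local miner (level-wise search); the paper uses CHS instead.

    Returns itemsets that contain a required item and are frequent
    within `partition`.
    """
    # The threshold is relative to the full partition size, before pruning.
    threshold = min_support * len(partition)
    # Assumed constraint-based pruning: a transaction with no required item
    # cannot support any itemset that satisfies the constraint.
    kept = [t for t in partition if t & required]
    level = {frozenset([item]) for item in set().union(*kept)}
    frequent = set()
    while level:
        counts = count_support(kept, level)
        survivors = {c for c, n in counts.items() if n >= threshold}
        frequent |= survivors
        # Join step: merge pairs of survivors that differ in exactly one item.
        size = len(next(iter(level))) + 1
        level = {a | b for a in survivors for b in survivors
                 if len(a | b) == size}
    return {s for s in frequent if s & required}


def partition_mine(transactions, min_support, required, n_parts):
    """Two-phase partition mining under the assumed item constraint."""
    transactions = [frozenset(t) for t in transactions]
    required = frozenset(required)
    # Phase 1: split into disjoint partitions and mine each one locally.
    step = max(1, ceil(len(transactions) / n_parts))
    parts = [transactions[i:i + step]
             for i in range(0, len(transactions), step)]
    candidates = set()
    for part in parts:
        candidates |= mine_local(part, min_support, required)
    # Phase 2: one pass over all data to count every global candidate.
    counts = count_support(transactions, candidates)
    threshold = min_support * len(transactions)
    return {c: n for c, n in counts.items() if n >= threshold}


example = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "d"},
    {"a", "b", "c", "d"},
    {"c", "d"},
    {"a", "c", "d"},
]
result = partition_mine(example, min_support=0.5, required={"a"}, n_parts=2)
for itemset, count in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

The merge step is safe because an itemset that is frequent over the whole dataset must be frequent in at least one partition, so the union of local results cannot miss a global frequent itemset. In the example, the second partition contributes `{a, d}` and `{a, c, d}` as candidates, and the global counting pass discards them.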

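Authors: Chen Huiping (陈慧萍), Zhu Feng (朱峰), Wang Jiandong (王建东), Zhou Xiaoqin (周小芹)

Affiliations: College of Computer and Information Engineering, Hohai University; College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics

Source: Systems Engineering and Electronics (《系统工程与电子技术》), indexed in EI, CSCD, and the Peking University Core Journals list; 2006, No. 7, pp. 1082-1086 (5 pages)

Funding: Supported by the National "973" Program Basic Research Development Fund (G1999032701)

Keywords: data mining; association rule; frequent itemset; partition

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]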
