
Algorithm for mining frequent itemsets with item constraint based on partition (cited by: 1)
Abstract: To improve the efficiency of association rule mining and its scalability to large datasets, this paper proposes Partition_CHS_Miner, a partition-based algorithm for mining frequent itemsets under item constraints. The algorithm prunes the dataset according to the constraints and stores the remaining data in a constraint-based hyper-structure (CHS). A large dataset is first divided into disjoint subsets, each small enough to fit in main memory. A hyper-structure-based mining algorithm with item constraints then extracts the local frequent itemsets from each subset. Finally, the local frequent itemsets of all subsets are merged into a global set of constrained candidate itemsets, from which the global frequent itemsets are computed. Experiments demonstrate the effectiveness of the algorithm.
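The partition-and-merge flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the CHS structure is replaced by plain brute-force subset counting, the item constraint is assumed to take the simple form "every reported itemset must contain all items in `required`", and the names `constrained_counts` and `partition_mine` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def constrained_counts(transactions, required):
    """Count every itemset that contains all `required` items.

    Brute-force stand-in for the CHS-based local miner (an assumption).
    """
    counts = Counter()
    for t in transactions:
        if not required <= t:
            continue  # constraint-based pruning: this transaction cannot help
        rest = sorted(t - required)
        for k in range(len(rest) + 1):
            for extra in combinations(rest, k):
                counts[required | frozenset(extra)] += 1
    return counts


def partition_mine(transactions, required, min_sup, n_parts):
    """Two-phase partition mining with a simple 'must contain' constraint."""
    transactions = [frozenset(t) for t in transactions]
    required = frozenset(required)
    s = Fraction(min_sup).limit_denominator(10**6)  # exact threshold arithmetic

    # Split into disjoint, contiguous partitions (ceiling division for size).
    size = max(1, -(-len(transactions) // n_parts))
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]

    # Phase 1: itemsets frequent in at least one partition become candidates.
    # The local threshold is relative to the unpruned partition size.
    candidates = set()
    for part in parts:
        for itemset, count in constrained_counts(part, required).items():
            if count >= s * len(part):
                candidates.add(itemset)

    # Phase 2: count the candidates over the whole dataset.
    result = {}
    for itemset in candidates:
        support = sum(1 for t in transactions if itemset <= t)
        if support >= s * len(transactions):
            result[itemset] = support
    return result
```

Because an itemset that is infrequent in every partition cannot be frequent overall, phase 1 never discards a globally frequent itemset, so the number of partitions affects memory use and candidate count but not the final result.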
Source: Systems Engineering and Electronics (《系统工程与电子技术》), indexed in EI, CSCD, and the Peking University Core Journals list; 2006, No. 7, pp. 1082-1086 (5 pages)
Funding: Supported by the National "973" Program basic research development fund (G1999032701)
Keywords: data mining; association rule; frequent itemset; partition


