A new algorithm for mining maximal frequent itemsets from uncertain data (cited by: 1)
Abstract: Uncertain data mining has recently become a hotspot in data mining, and frequent itemset mining is one of its central problems. Most existing algorithms target the complete set of frequent itemsets; few mine maximal or closed frequent itemsets. This paper proposes UMF-growth, a UF-Tree-based algorithm for mining maximal frequent itemsets from uncertain data. Mining proceeds in two steps: the first finds the local maximal frequent itemsets that have each frequent 1-itemset as a suffix, and the second derives all global maximal frequent itemsets. Experimental results show that UMF-growth performs well and is particularly suited to dense datasets with short transactions.
Source: Journal of Shandong University of Technology (Natural Science Edition), CAS, 2013, No. 5, pp. 17-21, 27 (6 pages)
Funding: National Natural Science Foundation of China (61163010); Natural Science Foundation of Shandong Province (ZR2011FL013)
Keywords: uncertain data; maximal frequent itemsets; UF-Tree
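
The two-step process described in the abstract can be illustrated with a small sketch. This is not the paper's UMF-growth: it assumes the expected-support model commonly used with UF-Tree (the support of an itemset is the sum, over transactions, of the product of its items' existence probabilities), and it replaces the tree traversal with brute-force enumeration. The function names, the item ordering, and the sample database are all hypothetical.

```python
from itertools import combinations
from math import prod


def expected_support(itemset, db):
    """Sum over transactions of the product of the items' existence probabilities."""
    return sum(prod(t.get(i, 0.0) for i in itemset) for t in db)


def keep_maximal(itemsets):
    """Drop every itemset that is a proper subset of another itemset in the collection."""
    sets = set(itemsets)
    return {s for s in sets if not any(s < other for other in sets)}


def mine_maximal(db, min_esup):
    """Brute-force stand-in for the two-step process (assumption: no UF-Tree is built)."""
    # Frequent 1-items, ordered by descending expected support (ties broken by name).
    items = {i for t in db for i in t}
    freq = [i for i in items if expected_support([i], db) >= min_esup]
    freq.sort(key=lambda i: (-expected_support([i], db), i))

    local = []
    for pos, suffix in enumerate(freq):
        earlier = freq[:pos]
        # Step 1: enumerate frequent itemsets whose last item is `suffix`,
        # then keep only the local maximal ones for this suffix.
        frequent_here = []
        for k in range(len(earlier) + 1):
            for combo in combinations(earlier, k):
                candidate = frozenset(combo) | {suffix}
                if expected_support(candidate, db) >= min_esup:
                    frequent_here.append(candidate)
        local.extend(keep_maximal(frequent_here))

    # Step 2: merge the local results and keep the global maximal itemsets.
    return keep_maximal(local)


# Hypothetical uncertain database: each transaction maps item -> existence probability.
db = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.8, "b": 0.7},
    {"a": 0.6, "c": 0.9},
    {"b": 0.9, "c": 0.4},
]
result = mine_maximal(db, min_esup=1.0)
print(sorted(sorted(s) for s in result))  # [['a', 'b'], ['c']]
```

In this sample, {b} is a local maximal itemset for suffix b but is absorbed by {a, b} in the second step, which is why a global pass is needed after the per-suffix pass.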