Mining Frequent Closed Patterns for Very High Dimensional Data: A Review (cited by: 1)
Abstract: Mining frequent patterns is a fundamental and essential problem in many data mining applications. Mining frequent closed itemsets provides complete and non-redundant results for frequent pattern analysis. The growth of bioinformatics has produced datasets with a new characteristic: they typically contain a very large number of columns. Such high-dimensional datasets pose a great challenge for existing frequent closed pattern mining algorithms. This paper surveys the algorithms for mining frequent closed itemsets in very high dimensional data and organizes them into a hierarchy by their characteristics. It compares two classes of row enumeration-based algorithms, discusses a hybrid algorithm that automatically switches between feature (column) enumeration and row enumeration during mining based on the characteristics of the data subset being considered, and finally points out research directions in this field.
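Author: Yang Fengzhao (杨风召)
Source: Computer Systems & Applications (《计算机系统应用》), 2011, No. 11, pp. 231-235 (5 pages)
Funding: National Natural Science Foundation of China (71072172); Merit-based Funding Program for Science and Technology Activities of Returned Overseas Scholars (YFZ302002); Priority Academic Program Development of Jiangsu Higher Education Institutions
Keywords: frequent closed pattern; high dimensional data; data mining; survey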
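
The row enumeration idea named in the abstract can be illustrated with a small sketch. It relies on a standard fact: every closed itemset equals the intersection of the rows that contain it, so when a dataset has few rows and many columns it is cheaper to search over row sets than over column sets. The code below is a deliberately naive illustration of that idea and not an implementation of any of the surveyed algorithms; the function name, the toy gene labels, and the absence of any pruning are assumptions made for the example.

```python
from itertools import combinations


def closed_patterns_by_row_enumeration(rows, min_support):
    """Return {closed itemset: support} by brute-force row enumeration.

    rows        -- list of frozensets, one per row (transaction/sample)
    min_support -- minimum number of rows a pattern must occur in (>= 1)

    Illustrative only: it visits every row subset of size >= min_support,
    so it is exponential in the number of rows and does no pruning.
    """
    if min_support < 1:
        raise ValueError("min_support must be at least 1")

    closed = {}
    n = len(rows)
    for k in range(min_support, n + 1):
        for row_ids in combinations(range(n), k):
            # Items shared by all selected rows; this intersection is closed.
            items = frozenset.intersection(*(rows[i] for i in row_ids))
            if not items:
                continue
            # Support = number of rows that contain the shared items.
            # It is at least k, hence at least min_support.
            closed[items] = sum(1 for r in rows if items <= r)
    return closed


# Toy data: 3 rows (samples) over 6 columns (hypothetical gene labels).
rows = [
    frozenset({"g1", "g2", "g3", "g4"}),
    frozenset({"g1", "g2", "g5"}),
    frozenset({"g1", "g3", "g4", "g6"}),
]
result = closed_patterns_by_row_enumeration(rows, min_support=2)
for items, support in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(items), support)
```

With three rows the search space is only the subsets of three row identifiers, regardless of how many columns the data has; a column enumeration over the same data would have to consider subsets of six items. Practical row enumeration algorithms add pruning on top of this search, which the sketch omits.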
References (10)

  • 1. Pasquier N, Bastide Y, Taouil R, Lakhal L. Discovering frequent closed itemsets for association rules. In: Beeri C, Buneman P, eds. Proc. of the 7th International Conference on Database Theory, LNCS 1540. Heidelberg: Springer Berlin, 1999: 398-416.
  • 2. Pei J, Han J, Mao R. CLOSET: An efficient algorithm for mining frequent closed itemsets. In: Chen W, Naughton JF, Bernstein PA, eds. Proc. 2000 ACM-SIGMOD International Workshop Data Mining and Knowledge Discovery. New York: ACM Press, 2000: 21-30.
  • 3. Burdick D, Calimlim M, Gehrke J. MAFIA: A maximal frequent itemset algorithm for transactional databases. In: Georgakopoulos D, Buchmann A, eds. Proc. of the 17th International Conference on Data Engineering. Heidelberg: IEEE Computer Society, 2001: 443-452.
  • 4. Zaki M, Hsiao C. CHARM: An efficient algorithm for closed association rule mining. In: Grossman RL, Han J, Kumar V, Mannila H, Motwani R, eds. Proc. of 2002 SIAM International Conference Data Mining. Arlington, VA, 2002: 457-473.
  • 5. Wang J, Han J, Pei J. CLOSET+: Searching for the best strategies for mining frequent closed itemsets. In: Getoor L, Senator TE, Domingos P, Faloutsos C, eds. Proc. of 2003 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM Press, 2003: 236-245.
  • 6. Pan F, Cong G, Tung AK. CARPENTER: Finding closed patterns in long biological datasets. In: Getoor L, Senator TE, Domingos P, Faloutsos C, eds. Proc. of 2003 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM Press, 2003: 637-642.
  • 7. Cong G, Tung AK, Xu X, et al. FARMER: Finding interesting rule groups in microarray datasets. In: Weikum G, ed. Proc. of the ACM SIGMOD International Conference on Management of Data 2004. New York: ACM Press, 2004: 143-154.
  • 8. Cong G, Tan K, Tung AK, et al. Mining top-k covering rule groups for gene expression data. In: Ozcan F, ed. Proc. of the ACM SIGMOD International Conference on Management of Data 2005. New York: ACM Press, 2005: 670-681.
  • 9. Liu H, Han J, Xin D, Shao Z. Mining frequent patterns from very high dimensional data: A top-down row enumeration approach. In: Ghosh J, Lambert D, Skillicorn DB, Srivastava J, eds. Proc. of the Sixth SIAM International Conference on Data Mining. Bethesda: SIAM, 2006: 20-22.
  • 10. Pan F, Tung AK, Cong G, Xu X. COBBLER: Combining column and row enumeration for closed pattern discovery. In: Hatzopoulos M, Manolopoulos Y, eds. Proc. of 2004 International Conference on Scientific and Statistical Database Management. Washington: IEEE Computer Society, 2004: 21-30.

