
Research of Frequent Closed Pattern Mining in Long Biological Datasets (cited by: 1)
Abstract: Traditional frequent-itemset mining algorithms are inefficient and produce many redundant patterns when applied to dense or long datasets, such as gene expression datasets. To address this, researchers have proposed the concept of closed patterns together with algorithms for mining them; studies have shown that mining closed patterns can substantially reduce the number of itemsets and eliminate many redundant patterns. Targeting the characteristics of biological data, this paper proposes a novel algorithm, REMFOR, for mining frequent closed patterns. Building on the closed-pattern concept and row enumeration, it uses a vertical data structure and fp-tree techniques, constructing a row fp-tree over the row set to mine frequent closed patterns. An example and experiments show that the algorithm is correct and effective.
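Authors: Zhou Ming (周明), Li Hong (李宏)
Source: Computer Engineering (《计算机工程》), 2007, No. 2, pp. 74-76 (3 pages). Indexed in CAS, CSCD, and Peking University Core Journals.
Funding: National Natural Science Foundation of China (Grant No. 60433020)
Keywords: data mining; frequent itemsets; closed pattern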
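
The notions named in the abstract (closed patterns, row enumeration, a vertical data layout) can be illustrated with a small sketch. This is not REMFOR: the record gives no pseudocode, and the row fp-tree is omitted entirely. It is a brute-force illustration that relies on one fact: the set of items shared by any group of rows is always a closed pattern. The function name `closed_patterns_by_row_enumeration` and the toy data are assumptions made for this example.

```python
from itertools import combinations


def closed_patterns_by_row_enumeration(rows, min_support):
    """Brute-force sketch: enumerate row subsets, not item subsets.

    rows        -- list of sets of items (one set per row/sample)
    min_support -- minimum number of rows a pattern must occur in
    Returns a dict mapping each frequent closed pattern to its support.
    """
    # Vertical layout: item -> set of ids of the rows that contain it.
    vertical = {}
    for rid, items in enumerate(rows):
        for item in items:
            vertical.setdefault(item, set()).add(rid)

    closed = {}
    n = len(rows)
    for k in range(1, n + 1):
        for row_ids in combinations(range(n), k):
            # Items shared by every row in this subset; this
            # intersection is a closed pattern whenever it is non-empty.
            common = set.intersection(*(set(rows[r]) for r in row_ids))
            if not common:
                continue
            # Support = number of rows containing all shared items,
            # read off the vertical layout.
            support_rows = set.intersection(*(vertical[i] for i in common))
            if len(support_rows) >= min_support:
                closed[frozenset(common)] = len(support_rows)
    return closed


# Toy data (hypothetical): 4 rows over items a-d.
rows = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"d"}]
result = closed_patterns_by_row_enumeration(rows, min_support=2)
# result == {frozenset("abc"): 2, frozenset("ab"): 3}
```

On this toy input there are seven frequent itemsets (a, b, c, ab, ac, bc, abc) but only two closed ones, which is the kind of reduction the abstract refers to. The enumeration is exponential in the number of rows, so it is only sensible when rows are few relative to items, as is typical of gene expression data. A practical algorithm would prune the search rather than visit every row subset.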

References (5)

  • [1] Burdick D, Calimlim M, Gehrke J. MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases[C]. Proc. of Intl. Conf. on Data Engineering, 2001: 325-337.
  • [2] Pasquier N, Bastide Y, Taouil R, et al. Discovering Frequent Closed Itemsets for Association Rules[C]. Proc. of the 7th Int'l. Conf. on Database Theory. Jerusalem: Springer-Verlag, 1999: 398-416.
  • [3] Pei J, Han J, Mao R. CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets[C]. Proc. of Workshop on Data Mining and Knowledge Discovery. Dallas: ACM Press, 2000: 21-30.
  • [4] Zaki M J, Hsiao C J. CHARM: An Efficient Algorithm for Closed Itemset Mining[C]. Proc. of the 2nd SIAM Int'l. Conf. on Data Mining. Arlington: SIAM, 2002: 12-28.
  • [5] Cong G, Yang J, Zaki M J. Carpenter: Finding Closed Patterns in Long Biological Datasets[C]. Proc. of SIGKDD'03, 2003: 413-419.

Co-cited literature (22)

  • [1] Agrawal R, Srikant R. Fast algorithms for mining association rules[C]//VLDB'94: 487-499.
  • [2] Mannila H, Toivonen H, Verkamo A. Efficient algorithms for discovering association rules[C]//KDD'94: 181-192.
  • [3] Pasquier N, Bastide Y, Taouil R, Lakhal L. Discovering frequent closed itemsets for association rules[C]//ICDT'99: 398-416.
  • [4] Zaki M, Hsiao C. CHARM: An efficient algorithm for closed itemset mining[C]//SDM'02: 457-473.
  • [5] Han J, Pei J, Yin Y. Mining frequent patterns without candidate generation[C]//SIGMOD'00: 1-12.
  • [6] Uno T, Asai T, Uchida Y, Arimura H. LCM ver. 2: Efficient mining algorithms for frequent/closed/maximal itemsets[C]//FIMI'04.
  • [7] Bayardo R. Efficiently mining long patterns from databases[C]//SIGMOD'98: 85-93.
  • [8] Gunopulos D, Mannila H, Khardon R, Toivonen H. Data mining, hypergraph transversals, and machine learning[C]//PODS'97: 209-219.
  • [9] Cong G, Tan K, Tung A K H, Xu X. Mining top-k covering rule groups for gene expression data[C]//SIGMOD'05: 670-681.
  • [10] Wang J, Han J, Lu Y, Tzvetkov P. TFP: An efficient algorithm for mining top-k frequent closed itemsets[J]. TKDE, 2005, 17: 652-664.

Citing literature (1)
