
HTCLOSE: Efficient Mining of Frequent Closed Patterns in Microarray Datasets
Abstract: Microarray datasets have few rows (samples) and many columns (genes), which makes frequent closed pattern mining difficult for traditional algorithms that search the column enumeration space. Working in the row enumeration space instead, the paper proposes a hyperlink structure, HT-struct, and builds on it a new frequent closed pattern mining algorithm, HTCLOSE. HTCLOSE searches the row enumeration space depth-first and combines efficient pruning techniques with a careful hyperlink (linked-list) organization, improving both running time and memory use. Experiments on real-life microarray datasets show that HTCLOSE is usually faster than CARPENTER, an algorithm that is also based on row enumeration.
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 2, pp. 274-278 (5 pages)
Funding: Supported by the Key Program of the National Natural Science Foundation of China (60533020)
Keywords: data mining; association rules; frequent closed pattern; microarray dataset; bioinformatics
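
The row-enumeration idea mentioned in the abstract can be illustrated with a small sketch. This is not HTCLOSE: it does not use HT-struct or the paper's pruning techniques, none of which are described in this record. It only shows, under simplifying assumptions, why enumerating row sets works when rows are few: the items shared by any set of rows always form a closed pattern, so a depth-first walk over row subsets reaches every closed pattern. The function name `mine_closed_patterns` and the gene labels `g1` to `g3` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Set


def mine_closed_patterns(
    rows: List[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Naive depth-first row enumeration (illustrative, not HTCLOSE).

    Each row is the set of items (e.g. expressed genes) in one sample.
    Returns a mapping {closed itemset: number of rows containing it}.
    """
    closed: Dict[FrozenSet[str], int] = {}

    def dfs(start: int, common: Optional[FrozenSet[str]]) -> None:
        # Extend the current row set with each later row in turn.
        for i in range(start, len(rows)):
            # Items shared by every row in the extended row set.
            items = frozenset(rows[i]) if common is None else common & rows[i]
            if not items:
                # Intersections only shrink, so this branch stays empty.
                continue
            if items not in closed:
                # Support is counted over ALL rows, not just the enumerated ones.
                support = sum(1 for r in rows if items <= r)
                if support >= min_support:
                    closed[items] = support
            dfs(i + 1, items)

    dfs(0, None)
    return closed


if __name__ == "__main__":
    samples = [{"g1", "g2", "g3"}, {"g1", "g2"}, {"g1", "g3"}]
    for pattern, support in mine_closed_patterns(samples, 2).items():
        print(sorted(pattern), support)
```

The search space here grows with the number of rows (at most 2^n row sets) rather than the number of columns, which is the reason row enumeration suits datasets with few samples and many genes. A practical algorithm would add pruning to avoid revisiting row sets that yield the same pattern.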

References (9)

  • 1. Sun X, Lu Z H, Xie J M. Foundation of bioinformatics [M]. Beijing: Tsinghua University Press, 2005.
  • 2. Madeira S, Oliveira A. Biclustering algorithms for biological data analysis: a survey [J]. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2004, 1: 24-45.
  • 3. Creighton C, Hanash S. Mining gene expression databases for association rules [J]. Bioinformatics, 2003, 19: 79-86.
  • 4. Pasquier N, Bastide Y, Taouil R. Discovering frequent closed itemsets for association rules [C]. Proceedings of Int'l Conf. on Database Theory, Jan. 1999, 398-416.
  • 5. Liu Junqiang, Sun Xiaoying, Zhuang Yueting, Pan Yunhe. A high-performance algorithm for mining closed patterns (挖掘闭合模式的高性能算法) [J]. Journal of Software (软件学报), 2004, 15(1): 94-102.
  • 6. Pan F, Cong G, Tung A, et al. CARPENTER: finding closed patterns in long biological datasets [C]. Proceedings of SIGKDD'03, Washington, D.C., August 2003, 637-642.
  • 7. Pei J, Han J. H-Mine: hyper-structure mining of frequent patterns in large databases [C]. Proceedings of ICDM 2001, Nov. 2001, 441-448.
  • 8. http://www.stjuderesearch.org/data/ALL1/all_datafiles.html
  • 9. http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi

