Research on a Closed-Pattern-Based Classification Algorithm for High-Dimensional Biological Data (cited by: 1)

Research of Classification with Closed Patterns Mining in Long Biological Datasets
Abstract: This paper proposes FEALL, a classification algorithm based on closed patterns and designed for the characteristics of gene expression profile data. The data are first preprocessed to remove irrelevant genes from the expression profiles, which lowers the time complexity of FEALL and reduces the number of redundant association rules. FEALL then builds a row FP-tree over the row sets and a path enumeration tree for each row, mines the upper bounds of the interesting rule groups, and builds a classifier on those upper bounds to predict the class of each sample. Samples that the classifier cannot recognize are assigned a class by a weight-based decision algorithm. Experiments show that FEALL is efficient and achieves high prediction accuracy.
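Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 8, pp. 1423-1426 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (60433020)
Keywords: association rules; rule group; closed pattern; upper bound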
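
The pipeline outlined in the abstract (mine closed patterns by enumerating rows, keep the upper bound of each rule group, classify with those rules) can be illustrated with a small brute-force sketch. This is not the FEALL implementation: it uses no row FP-tree or path enumeration tree, it assumes that the upper bound of a rule group is the closed itemset shared by the group's supporting rows, and every name in it (`closed_patterns_by_row_enumeration`, `rule_group_upper_bounds`, `predict`, the item labels such as `g1_high`, and the thresholds) is hypothetical.

```python
from itertools import combinations


def closed_patterns_by_row_enumeration(rows, min_support=1):
    """Map each closed itemset to the ids of all rows that contain it.

    Intersecting the items of any group of rows yields a closed itemset.
    This brute-force version is exponential in the number of rows, so it
    only suits toy inputs.
    """
    rows = [frozenset(r) for r in rows]
    row_ids = range(len(rows))
    closed = {}
    for size in range(max(min_support, 1), len(rows) + 1):
        for group in combinations(row_ids, size):
            common = frozenset.intersection(*(rows[i] for i in group))
            if not common:
                continue
            # Full support set: every row that contains the common items.
            closed[common] = frozenset(i for i in row_ids if common <= rows[i])
    return closed


def rule_group_upper_bounds(rows, labels, min_support=1, min_confidence=0.8):
    """Return (pattern, label, support, confidence) rules, best first.

    Assumption: each closed pattern stands in as the upper bound of the
    rule group whose rules share the same supporting rows.
    """
    rules = []
    patterns = closed_patterns_by_row_enumeration(rows, min_support)
    for pattern, support in patterns.items():
        for label in set(labels):
            hits = sum(1 for i in support if labels[i] == label)
            confidence = hits / len(support)
            if confidence >= min_confidence:
                rules.append((pattern, label, len(support), confidence))
    # Higher confidence first, then higher support, then a stable tie-break.
    rules.sort(key=lambda r: (-r[3], -r[2], sorted(r[0])))
    return rules


def predict(rules, sample):
    """Return the label of the first rule whose pattern the sample contains."""
    sample = frozenset(sample)
    for pattern, label, _, _ in rules:
        if pattern <= sample:
            return label
    # The paper passes such samples to a weight-based step, not shown here.
    return None


# Toy data: each row is one sample, each item a discretized gene level.
rows = [
    {"g1_high", "g2_low", "g3_high"},
    {"g1_high", "g2_low"},
    {"g2_low", "g4_high"},
    {"g4_high", "g5_low"},
]
labels = ["tumor", "tumor", "normal", "normal"]
rules = rule_group_upper_bounds(rows, labels, min_support=2)
```

With this toy input, `predict(rules, {"g1_high", "g2_low", "g5_low"})` matches the pattern `{g1_high, g2_low}`, while a sample that matches no pattern returns `None`, which is the case the abstract hands to the weight-based decision step. Row enumeration is attractive for gene expression data because such datasets have few rows (samples) and very many columns (genes).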