
An ensemble feature selection algorithm for high-dimensional microarray data (面向高维微阵列数据的集成特征选择算法)

Cited by: 2
Abstract: Feature selection algorithms are an important tool for microarray data analysis, and their classification performance and stability are critical to that analysis. To improve both, we propose an ensemble feature selection algorithm for high-dimensional microarray data that compensates for the limited information carried by a single gene subset. The algorithm first uses the signal-to-noise ratio method to select several discriminative genes. Then, for each discriminative gene, it evaluates the correlation between candidate genes and that discriminative gene using the conditional correlation coefficient, generating multiple relevant gene subsets. Finally, it integrates these similar gene subsets through ensemble learning. Experimental results show that, in most cases, the classification performance and stability of the proposed algorithm are superior to those of methods that select only a single gene subset.
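Authors: Sun Gang (孙刚), Zhang Jing (张靖)
Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core Journal, 2016, No. 7, pp. 1330-1337 (8 pages)
Funding: National Natural Science Foundation of China (51174257/F030504); Fundamental Research Funds for the Central Universities (2013BHZX0040); Anhui Provincial Research Institution Commissioned Special Key Project (2013WLGH01ZD)
Keywords: microarray data; signal-to-noise ratio; conditional correlation coefficient; feature selection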
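
The three steps described in the abstract (rank genes by signal-to-noise ratio, grow a subset of correlated genes around each top-ranked gene, then combine the subsets by ensemble voting) can be illustrated with a minimal sketch. This is not the authors' implementation, and several parts are assumptions: the abstract does not define the conditional correlation coefficient, so plain absolute Pearson correlation is used in its place; the base classifier (nearest centroid), the majority vote, the 0/1 class labels, and the parameter names `n_seeds` and `subset_size` are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import mean, stdev


def snr(values, labels):
    """Two-class signal-to-noise ratio: (mean0 - mean1) / (sd0 + sd1).
    Assumes labels are 0 or 1 and each class has at least two samples."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b) + 1e-12)


def pearson(x, y):
    """Pearson correlation; a stand-in (assumption) for the paper's
    conditional correlation coefficient, which the abstract does not define."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def select_subsets(X, labels, n_seeds=2, subset_size=2):
    """Return one gene subset (list of column indices) per discriminative gene."""
    genes = list(zip(*X))  # transpose: one tuple of expression values per gene
    ranked = sorted(range(len(genes)),
                    key=lambda g: abs(snr(genes[g], labels)), reverse=True)
    subsets = []
    for seed in ranked[:n_seeds]:
        # Rank the remaining genes by correlation with the discriminative gene.
        others = [g for g in range(len(genes)) if g != seed]
        others.sort(key=lambda g: abs(pearson(genes[g], genes[seed])), reverse=True)
        subsets.append([seed] + others[:subset_size - 1])
    return subsets


def centroid_predict(X, labels, subset, sample):
    """Nearest-centroid prediction using only the genes in `subset`."""
    best, best_dist = None, float("inf")
    for c in sorted(set(labels)):
        rows = [x for x, y in zip(X, labels) if y == c]
        centroid = [mean(r[g] for r in rows) for g in subset]
        dist = sum((sample[g] - m) ** 2 for g, m in zip(subset, centroid))
        if dist < best_dist:
            best, best_dist = c, dist
    return best


def ensemble_predict(X, labels, subsets, sample):
    """Majority vote over one base classifier per gene subset."""
    votes = Counter(centroid_predict(X, labels, s, sample) for s in subsets)
    return votes.most_common(1)[0][0]


# Toy data (made up for illustration): 6 samples x 4 genes, two classes.
X = [
    [1.0, 5.0, 1.1, 3.0],
    [1.2, 4.0, 1.5, 3.1],
    [0.9, 6.0, 1.0, 2.9],
    [3.0, 5.5, 3.1, 3.0],
    [3.2, 4.5, 3.3, 3.2],
    [2.9, 5.0, 2.8, 2.8],
]
labels = [0, 0, 0, 1, 1, 1]

subsets = select_subsets(X, labels)
print(subsets)
print(ensemble_predict(X, labels, subsets, [1.1, 5.0, 1.2, 3.0]))
```

In this toy data, genes 0 and 2 separate the two classes while genes 1 and 3 do not, so the subsets are built around the former. A real microarray study would use thousands of genes, cross-validation, and the paper's own correlation measure.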
