
Hybrid Gene Selection Algorithm Based on Optimized Neighborhood Rough Set (cited by: 7)
Abstract: DNA microarray technology can measure the activity of tens of thousands of genes in a cell simultaneously and is widely used in the clinical diagnosis of major genetic diseases. However, microarray data are typically high-dimensional with few samples and contain many noisy and redundant genes. To further improve classification performance on microarray data, this paper proposes a hybrid feature gene selection algorithm. First, the ReliefF algorithm eliminates a large number of irrelevant genes to produce a candidate subset of feature genes. Next, a neighborhood rough set model optimized by a differential evolution algorithm selects the feature genes. Finally, a support vector machine classifies the samples to verify the effectiveness of the algorithm. Simulation results show that the algorithm achieves higher classification accuracy with as few feature genes as possible, which both strengthens generalization performance and improves time efficiency, and that it offers a useful reference for the clinical diagnosis of disease-causing genes.
Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2014, No. 10, pp. 291-294, 316 (5 pages)
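Funding: Supported by the National Natural Science Foundation of China (81160183, 11305097), the Scientific Research Program of the Shaanxi Provincial Department of Education (147JK1132), and Shaanxi University of Technology funds (SLGKY13-41, SLGKY13-44).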
Keywords: feature gene selection; ReliefF algorithm; neighborhood rough set model; differential evolution algorithm
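The neighborhood rough set step named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition in which a sample belongs to the positive region when every sample within distance `delta` of it shares its class, and it uses a plain greedy forward search. The abstract does not say what the differential evolution algorithm optimizes, so that part is not reproduced here; the function names, the Euclidean distance, the `delta` value, and the toy data are all illustrative assumptions rather than the authors' implementation.

```python
import math


def neighborhood_dependency(X, y, features, delta):
    """Fraction of samples whose delta-neighborhood (Euclidean distance
    restricted to `features`) contains only samples of the same class."""
    n = len(X)
    consistent = 0
    for i in range(n):
        pure = True
        for j in range(n):
            dist = math.sqrt(sum((X[i][f] - X[j][f]) ** 2 for f in features))
            if dist <= delta and y[j] != y[i]:
                pure = False
                break
        if pure:
            consistent += 1
    return consistent / n


def greedy_select(X, y, delta, eps=1e-9):
    """Forward selection (an assumed stand-in for the paper's search):
    add the gene that raises the dependency most, stop when none helps."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scored = [(neighborhood_dependency(X, y, selected + [f], delta), f)
                  for f in remaining]
        score, feat = max(scored, key=lambda t: t[0])
        if score - best <= eps:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


# Toy expression matrix, assumed scaled to [0, 1]: gene 0 separates the
# two classes, gene 1 is noise, gene 2 is constant.
X = [
    [0.10, 0.50, 0.3],
    [0.15, 0.90, 0.3],
    [0.20, 0.10, 0.3],
    [0.80, 0.55, 0.3],
    [0.85, 0.15, 0.3],
    [0.90, 0.95, 0.3],
]
y = [0, 0, 0, 1, 1, 1]

selected, gamma = greedy_select(X, y, delta=0.2)
print(selected, gamma)
```

On this toy data the search keeps only gene 0, because that gene alone already places every sample in the positive region. In a pipeline like the one the abstract describes, the input columns would be the candidate genes that survive the ReliefF filter, and the selected columns would be passed to the support vector machine.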


