
Feature Gene Selection Based on Multi-Objective EDA

Gene selection with MOEDA
Abstract: In gene expression data, the number of genes (features) is far larger than the number of conditions (samples), and systematic error and technical limitations introduce a large amount of noise. Biologists and clinicians also want to pick out, from a large pool of genes, a small set of marker genes relevant to disease diagnosis. Gene selection is therefore the key step in using gene expression data for disease classification and prediction. The two commonly used strategies are filter methods and wrapper methods. Combining the advantages of both, this paper proposes a multi-objective estimation of distribution algorithm (MOEDA) for gene selection. First, a scoring function is used to determine the candidate gene set for MOEDA, retaining the genes with high evaluation scores. MOEDA then optimizes several objectives at once, namely multiple performance measures of a KNN classifier (such as accuracy and sensitivity) and the number of genes, and selects from the candidates the feature gene subset with the strongest overall discriminative power. Experiments on the small round blue cell tumor dataset SRBCT show that, without any complex parameter setting, the method selects only 7 genes out of 2,000 and the classifier reaches 95% classification accuracy on the independent test set.
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2009, No. 8, pp. 2891-2894 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (60773010)
Keywords: classification prediction; gene selection; multi-objective evolution; multi-objective estimation of distribution algorithm (MOEDA)
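
The two-stage procedure summarized in the abstract (filter by score, then multi-objective search over gene subsets evaluated with KNN) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not name the scoring function or the probability model, so the sketch assumes a between-class to within-class variance ratio as the score and a univariate marginal model updated from the non-dominated subsets. All function names, the smoothing rate, and the default parameters are hypothetical.

```python
import numpy as np

def filter_scores(X, y):
    """Between/within-class variance ratio per gene (assumed scoring function)."""
    classes, overall = np.unique(y), X.mean(axis=0)
    between = sum((y == c).sum() * (X[y == c].mean(axis=0) - overall) ** 2 for c in classes)
    within = sum(((X[y == c] - X[y == c].mean(axis=0)) ** 2).sum(axis=0) for c in classes)
    return between / (within + 1e-12)

def loo_knn_predictions(X, y, k=3):
    """Leave-one-out k-nearest-neighbour predictions with Euclidean distance."""
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)                 # a sample may not vote for itself
    nearest = np.argsort(d, axis=1)[:, :k]
    classes = np.unique(y)
    votes = (y[nearest][:, :, None] == classes).sum(axis=1)
    return classes[votes.argmax(axis=1)]

def objectives(mask, X, y, k=3):
    """(error rate, worst per-class miss rate, gene count); all are minimised."""
    if not mask.any():
        return (1.0, 1.0, X.shape[1])           # empty subset gets the worst values
    pred = loo_knn_predictions(X[:, mask], y, k)
    error = float((pred != y).mean())
    worst_miss = max(float((pred[y == c] != c).mean()) for c in np.unique(y))
    return (error, worst_miss, int(mask.sum()))

def non_dominated(F):
    """Indices of rows of F that no other row dominates (minimisation)."""
    F = np.asarray(F, dtype=float)
    return [i for i, f in enumerate(F)
            if not np.any(np.all(F <= f, axis=1) & np.any(F < f, axis=1))]

def moeda_sketch(X, y, n_candidates=20, pop=40, gens=15, k=3, seed=0):
    rng = np.random.default_rng(seed)
    # Stage 1 (filter): keep the highest-scoring genes as candidates.
    cand = np.argsort(filter_scores(X, y))[::-1][:n_candidates]
    Xc = X[:, cand]
    p = np.full(len(cand), 0.2)                 # per-gene inclusion probability
    archive = {}                                # subset bytes -> (mask, objectives)
    for _ in range(gens):
        # Stage 2 (wrapper): sample subsets, evaluate them, keep the Pareto front.
        masks = rng.random((pop, len(cand))) < p
        F = [objectives(m, Xc, y, k) for m in masks]
        front = non_dominated(F)
        # Re-estimate the distribution from the non-dominated subsets (assumed rule).
        p = np.clip(0.5 * p + 0.5 * masks[front].mean(axis=0), 0.05, 0.95)
        for i in front:
            archive[masks[i].tobytes()] = (masks[i], F[i])
    items = list(archive.values())
    best = non_dominated([f for _, f in items])
    # Return original gene indices together with their objective values.
    return [(cand[items[i][0]], items[i][1]) for i in best]
```

Calling `moeda_sketch(X, y)` with a samples-by-genes float matrix `X` and a label vector `y` returns a list of mutually non-dominated gene subsets, each paired with its error rate, worst per-class miss rate, and size. Picking one subset from that list (for example, the smallest one below an error threshold) is left to the user, since the abstract does not say how the final subset is chosen.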

