Abstract
DNA microarray technology can measure the activity of tens of thousands of genes in a cell simultaneously and is widely used in the clinical diagnosis of major genetic diseases. However, microarray data are typically high-dimensional with few samples, and they contain many noisy and redundant genes. To further improve classification performance on microarray data, this paper proposes a hybrid feature gene selection algorithm. First, the ReliefF algorithm removes a large number of irrelevant genes to obtain a candidate subset of feature genes. Next, a neighborhood rough set model optimized by a differential evolution algorithm selects the feature genes from that subset. Finally, a support vector machine is used as the classifier to verify the effectiveness of the algorithm. Simulation results show that the algorithm achieves higher classification accuracy with as few feature genes as possible, which both strengthens generalization performance and improves time efficiency. The results also offer a useful reference for the clinical diagnosis of disease-causing genes.
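The first two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a simplified Relief score (one nearest hit and one nearest miss per sample) in place of full ReliefF, and it only evaluates the neighborhood rough set dependency of a given feature subset, leaving out the differential evolution search and the support vector machine. The function names `relief_weights` and `neighborhood_dependency`, the radius `delta`, and the toy data are all assumptions made for illustration.

```python
import math


def relief_weights(X, y):
    """Simplified Relief: one nearest hit and one nearest miss per sample."""
    n, d = len(X), len(X[0])
    # Per-feature value ranges, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            # Reward features that differ across classes,
            # penalize features that differ within a class.
            w[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return w


def neighborhood_dependency(X, y, features, delta):
    """Fraction of samples whose delta-neighborhood on `features` is class-pure."""
    n = len(X)
    consistent = 0
    for i in range(n):
        pure = True
        for k in range(n):
            d_ik = math.sqrt(sum((X[i][j] - X[k][j]) ** 2 for j in features))
            if d_ik <= delta and y[k] != y[i]:
                pure = False
                break
        if pure:
            consistent += 1
    return consistent / n


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X = [[0.0, 0.5], [0.1, 0.9], [0.2, 0.1], [0.8, 0.6], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
dep_informative = neighborhood_dependency(X, y, [0], delta=0.3)
dep_noisy = neighborhood_dependency(X, y, [1], delta=0.3)
```

In a pipeline like the one the abstract outlines, a filter score of this kind would first discard low-weight genes, and a dependency measure of this kind could then serve as part of the fitness that an optimizer evaluates for each candidate subset. How the paper combines dependency, subset size, and the neighborhood radius is not stated in the abstract.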
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal
2014, No. 10, pp. 291-294, 316 (5 pages)
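Funding
Supported by the National Natural Science Foundation of China (81160183, 11305097), the Scientific Research Program of the Shaanxi Provincial Department of Education (147JK1132), and the Shaanxi University of Technology Fund (SLGKY13-41, SLGKY13-44).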
Keywords
Feature gene selection, ReliefF algorithm, Neighborhood rough set model, Differential evolution algorithm