Abstract
Microarray data are high-dimensional but contain few samples, which complicates analysis and makes the selection of informative genes especially important. This paper selects informative genes with a genetic algorithm (GA), using k-nearest neighbors (KNN) as the pattern recognition method. A support vector machine (SVM) is then used to construct a tumor diagnosis system, and its classification performance is tested with different kernel functions. On the classic leukemia data set, the method achieves 100% classification accuracy on a test set of 34 samples.
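The GA/KNN selection step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the chromosome length, the value of k, the genetic operators, or the fitness function, so everything below is an assumption chosen for illustration. That includes the function names `make_toy_data`, `loo_knn_accuracy`, and `select_genes`, the leave-one-out fitness, the truncation selection, and the synthetic data used in place of the leukemia set. The SVM stage is omitted.

```python
import random


def make_toy_data(n_samples=30, n_genes=40, informative=(0, 1, 2), seed=1):
    """Synthetic expression matrix (assumption): only `informative` genes differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 4.0 * label  # shift informative genes for class 1
        X.append(row)
        y.append(label)
    return X, y


def loo_knn_accuracy(X, y, genes, k=3):
    """Leave-one-out accuracy of a k-NN classifier restricted to the given genes."""
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from sample i to every other sample.
        dists = sorted(
            (sum((X[i][g] - X[j][g]) ** 2 for g in genes), y[j])
            for j in range(len(X)) if j != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        correct += predicted == y[i]
    return correct / len(X)


def select_genes(X, y, subset_size=5, pop_size=20, generations=30, seed=0):
    """Evolve fixed-size gene subsets; fitness is leave-one-out k-NN accuracy."""
    rng = random.Random(seed)
    n_genes = len(X[0])

    def fitness(chrom):
        return loo_knn_accuracy(X, y, chrom)

    population = [rng.sample(range(n_genes), subset_size) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = list(set(a) | set(b))  # crossover: draw from the union of both parents
            child = rng.sample(pool, subset_size)
            if rng.random() < 0.3:  # mutation: replace one gene with an unused one
                unused = [g for g in range(n_genes) if g not in child]
                child[rng.randrange(subset_size)] = rng.choice(unused)
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return sorted(best), fitness(best)


if __name__ == "__main__":
    X, y = make_toy_data()
    genes, acc = select_genes(X, y)
    print("selected genes:", genes, "leave-one-out accuracy:", acc)
```

In a pipeline like the one the abstract outlines, the selected gene indices would then be used to reduce the training and test matrices before fitting the SVM with each candidate kernel.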
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
Peking University Core Journals (北大核心)
2007, Issue 19, pp. 204-206 (3 pages)
Keywords
microarray data
gene expression
genetic algorithm (GA)
k nearest neighbors (KNN)
support vector machine (SVM)