Abstract
Emerging DNA gene chip technology provides powerful tools and methods for studying the pathogenesis of disease and for clinical diagnosis at the molecular level. However, cancer gene data are characterized by high noise, high correlation, high dimensionality, and small sample sizes, which make their classification difficult and complex. This paper proposes a classification method based on a binary decision tree that reduces the difficulty of cancer diagnosis and classification. Numerical experiments on four classic cancer gene data sets, implemented in MATLAB, yielded relatively high classification accuracy. We therefore consider the new method a useful alternative tool for future research on cancer gene data classification.
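To make the idea of binary decision tree classification on expression data concrete, the following is a minimal sketch of a generic CART-style tree (axis-aligned threshold splits chosen by Gini impurity). It is not the algorithm proposed in the paper, whose details are not given in the abstract. The function names (`gini`, `best_split`, `build_tree`, `predict`) and the tiny data set are hypothetical and invented purely for illustration; Python is used here for self-containment, whereas the paper's experiments were run in MATLAB.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (feature index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(y)
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:  # require a strict improvement
                best, best_score = (j, t), score
    return best


def build_tree(X, y, depth=0, max_depth=3):
    """Internal node: (feature, threshold, left, right). Leaf: a class label."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return Counter(y).most_common(1)[0][0]
    j, t = split
    left = [i for i in range(len(y)) if X[i][j] <= t]
    right = [i for i in range(len(y)) if X[i][j] > t]
    return (
        j,
        t,
        build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
        build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth),
    )


def predict(tree, row):
    """Walk from the root to a leaf; labels must not be tuples."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree


# Synthetic toy data (assumption): 6 samples, 4 "genes"; only gene 0 is informative.
X = [
    [2.1, 0.3, 5.0, 1.2],
    [1.9, 0.4, 4.8, 3.1],
    [2.3, 0.2, 5.1, 2.2],
    [0.4, 0.3, 4.9, 1.1],
    [0.6, 0.5, 5.2, 3.0],
    [0.5, 0.1, 5.0, 2.4],
]
y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

tree = build_tree(X, y)
print(predict(tree, [2.0, 0.3, 5.0, 2.0]))
```

With real gene chip data, where genes far outnumber samples, a tree of this kind would normally be preceded by gene selection and evaluated with cross-validation; the sketch omits both.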
Source
《数理医药学杂志》
2011, Issue 5, pp. 517-519 (3 pages)
Journal of Mathematical Medicine
Keywords
gene chip
cancer classification
binary decision tree