
A Comparative Study of Four Pattern Classification Methods Applied to Gene Expression Profile Analysis (cited by: 3)

Research on Pattern Classification Methods Using Gene Expression Data
Abstract: One application of DNA microarray technology is to identify the types and subtypes of diseases such as cancer by applying pattern classification methods to gene expression profile data. In this paper, we use a gene expression dataset of 2000 genes provided by Affymetrix, comprising 40 colon cancer tissue samples and 22 normal tissue samples. We compare how four pattern classification methods (Fisher linear discriminant, Logit nonlinear discriminant, minimum distance, and K-nearest neighbor) perform in disease classification under different feature gene selection methods, assess their generalization ability, and examine their stability when the composition of the samples changes. The results show that: (1) feature genes selected by the t-test and by classification trees give clearly better classification results in all four classifiers than randomly selected genes; (2) the K-nearest neighbor classifier has the best classification performance of the four; (3) the minimum distance classifier and the K-nearest neighbor classifier generalize better; and (4) all four classifiers are fairly stable under changes in sample composition.
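Source: Journal of Biomedical Engineering (《生物医学工程学杂志》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2005, No. 3, pp. 505-509 (5 pages)
Funding: National Natural Science Foundation of China (39970397, 30170515, 30370798); National 863 Program (2002AA222052); Heilongjiang Science and Technology Key Project (GB03C602-4); Heilongjiang Natural Science Foundation (F0177); Project 211 "Tenth Five-Year Plan" construction project
Keywords: classification methods; gene expression profile; cancer; statistical classifier; DNA chip technology; pattern classifier; feature gene; feature selection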
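
The pipeline summarized in the abstract (rank genes by a t-test, keep the top few, then classify with a minimum distance or K-nearest neighbor rule) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic (22 "normal" and 40 "tumor" samples with a made-up number of shifted genes), the Welch form of the t-statistic, Euclidean distance, the single hold-out split, and all function names (`t_scores`, `select_genes`, `min_distance_predict`, `knn_predict`, `run_demo`) are assumptions chosen for illustration.

```python
import numpy as np


def t_scores(X, y):
    """Absolute Welch t-statistic per gene; X is samples x genes, y holds 0/1 labels."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / (se + 1e-12)


def select_genes(X, y, k):
    """Indices of the k genes with the largest |t|."""
    return np.argsort(t_scores(X, y))[::-1][:k]


def min_distance_predict(X_train, y_train, X_test):
    """Assign each test sample to the class whose centroid is nearest (Euclidean)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[d.argmin(axis=1)]


def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training samples; use odd k to avoid ties."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = y_train[nearest]  # shape (n_test, k), labels are 0/1
    return (votes.mean(axis=1) > 0.5).astype(int)


def run_demo(seed=0, n_genes=200, n_informative=10, n_selected=10):
    """Synthetic stand-in for a tumor/normal expression matrix (not real data)."""
    rng = np.random.default_rng(seed)
    y = np.array([0] * 22 + [1] * 40)          # 22 normal, 40 tumor
    X = rng.normal(size=(y.size, n_genes))
    X[y == 1, :n_informative] += 2.0           # shift the informative genes in tumors

    order = rng.permutation(y.size)
    train, test = order[:42], order[42:]       # simple hold-out split

    # Select genes on the training samples only, to avoid selection bias.
    genes = select_genes(X[train], y[train], n_selected)
    X_tr, X_te = X[train][:, genes], X[test][:, genes]

    acc_md = float(np.mean(min_distance_predict(X_tr, y[train], X_te) == y[test]))
    acc_knn = float(np.mean(knn_predict(X_tr, y[train], X_te, k=3) == y[test]))
    return {"genes": genes, "min_distance": acc_md, "knn": acc_knn}


if __name__ == "__main__":
    print(run_demo())
```

Gene selection is done on the training samples only, so the held-out accuracy is not inflated by genes chosen with knowledge of the test labels. Replacing `select_genes` with a random choice of gene indices gives a baseline analogous to the randomly selected genes mentioned in the abstract.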

References (7)

  • 1. Dudoit S, Fridlyand J, Speed TP. Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association, 2002; 97(457): 77.
  • 2. Lipshutz RJ, Fodor S, Gingeras T, et al. High density synthetic oligonucleotide arrays. Nature Genetics, 1999; 21(Suppl): 20.
  • 3. John GH, Kohavi R, Pfleger K. Irrelevant features and the subset selection problem. Machine Learning: Proceedings of the 11th International Conference, 1994: 121-129.
  • 4. Li X, Zhang TW, Guo Z. An ensemble feature gene selection method based on recursive classification trees [J]. Chinese Journal of Computers (计算机学报), 2004, 27(5): 675-682. Cited by: 26.
  • 5. Alon U, Barkai N, Notterman DA, et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences, 1999; 96: 6745.
  • 6. Bian ZQ. Pattern Recognition (模式识别) [M]. Tsinghua University Press, 1999.
  • 7. Park PJ, Pagano M, Bonetti M. A nonparametric scoring algorithm for identifying informative genes from microarray data. In: Pacific Symposium on Biocomputing, 2001: 52-63.

