A Study of Microarray-Based Cancer Diagnosis Methods Using Different Classification Models

Published English title: Study of different classification models based-on microarray
Abstract: The development of microarray technology has brought new opportunities to bioinformatics and made it possible to diagnose cancer at the level of gene expression. However, microarray data sets are high-dimensional and contain few samples, which challenges traditional machine learning methods. Using real gene expression data, this paper tests how well the main current classification and dimensionality reduction methods perform in cancer diagnosis. The experimental comparison shows that a support vector machine (SVM) with a linear kernel classifies tumor and non-tumor gene expression effectively, which provides a reference for cancer diagnosis.
Source: Chinese Journal of Bioinformatics (《生物信息学》), 2013, No. 3, pp. 161-166 (6 pages)
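Funding: National Natural Science Foundation of China (61001013); Scientific Research Project of the Heilongjiang Provincial Department of Education (12521392); Natural Science Foundation of Heilongjiang Province (F201119)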
Keywords: Microarray; Cancer Diagnosis; Classification; Principal Component Analysis
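
The abstract names two ingredients, dimensionality reduction (principal component analysis appears in the keywords) and a linear-kernel SVM, without showing how they fit together. The following is a minimal sketch of one way to combine them, not the authors' actual experimental setup. It assumes scikit-learn and NumPy are available, and it uses synthetic data in place of the real gene expression data, which is not part of this record. The function names `make_synthetic_expression` and `evaluate_pca_linear_svm`, the sample and gene counts, the number of principal components, and the 5-fold cross-validation are all illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_expression(n_samples=60, n_genes=2000, n_informative=30,
                              shift=1.5, seed=0):
    """Hypothetical stand-in for microarray data: few samples, many genes.

    Class 1 ("tumor") has a shifted mean in the first n_informative genes.
    """
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_samples // 2)        # balanced non-tumor / tumor labels
    X = rng.normal(size=(n_samples, n_genes))    # pure-noise background expression
    X[y == 1, :n_informative] += shift           # inject a class-dependent signal
    return X, y


def evaluate_pca_linear_svm(X, y, n_components=10, n_folds=5, seed=0):
    """Cross-validated accuracy of scaling -> PCA -> linear-kernel SVM.

    All preprocessing sits inside the pipeline, so scaling and PCA are
    refit on the training part of every fold and never see the test part.
    """
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=seed),
        SVC(kernel="linear", C=1.0),
    )
    cv = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    return cross_val_score(model, X, y, cv=cv, scoring="accuracy")


X, y = make_synthetic_expression()
scores = evaluate_pca_linear_svm(X, y)
print(f"fold accuracies: {np.round(scores, 3)}, mean = {scores.mean():.3f}")
```

Keeping the scaler and PCA inside the pipeline matters with high-dimensional, small-sample data: fitting them on all samples before cross-validation would leak test information into the reported accuracy. Swapping `SVC(kernel="linear")` for another estimator is how a comparison across classifiers could be set up under the same protocol.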