Method of extracting features from DNA microarray data for classification
Original title: DNA微阵列数据特征提取的分类方法研究 (cited by: 1)
Abstract: Gene sets selected from DNA microarray data by commonly used ranking methods often contain highly correlated genes, and scoring genes one at a time does not truly reflect the classification ability of the resulting feature set. In addition, the number of genes far exceeds the number of samples, which poses a further challenge for disease diagnosis. To address these problems, a feature extraction method for DNA microarray data is proposed for tissue classification. The method clusters the genes with K-means to obtain the center of each gene cluster, uses a ranking method to remove the clusters that are irrelevant to classification, extracts features from the remaining clusters with ICA, and then builds an SVM classifier to classify the tissues. Experiments on real biological data show that, by extracting a kind of compound gene, the method can evaluate the classification ability of genes jointly, reduce the number of features, and improve classification accuracy.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2010, Issue 28, pp. 40-42 (3 pages)
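Funding: National Social Science Foundation of China (No. 08CTQ003); Natural Science Foundation of Guangdong Province (No. 2008276); President's Fund of South China Agricultural University (No. 4900-K06166); Key Research Project of the Chongqing Science and Technology Commission (No. 2008AC0043)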
Keywords: DNA microarray; feature extraction; Independent Component Analysis (ICA); cluster analysis; Support Vector Machines (SVMs)
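
The four stages named in the abstract (K-means over genes, ranking of cluster centers, ICA, SVM) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record gives no parameter values, so `n_clusters`, `n_keep`, and `n_components` are placeholders; the ANOVA F-score stands in for the unspecified ranking method; scikit-learn is an assumed toolkit; and the data are synthetic, not the biological data used in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FastICA
from sklearn.feature_selection import f_classif
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def cluster_means(X, labels, n_clusters):
    """Average the genes of each cluster, giving one profile per cluster."""
    return np.column_stack(
        [X[:, labels == k].mean(axis=1) for k in range(n_clusters)]
    )


def fit(X, y, n_clusters=10, n_keep=4, n_components=2, seed=0):
    """X: (n_samples, n_genes) expression matrix; y: tissue class labels."""
    # 1. Cluster the genes (columns of X) so correlated genes are grouped.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(X.T)
    centers = cluster_means(X, labels, n_clusters)
    # 2. Rank cluster centers and keep the most class-relevant ones.
    #    (F-score is an assumed stand-in for the paper's ranking method.)
    scores, _ = f_classif(centers, y)
    keep = np.argsort(scores)[::-1][:n_keep]
    # 3. Extract independent components from the retained centers.
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    Z = ica.fit_transform(centers[:, keep])
    # 4. Train a linear SVM on the extracted features.
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear")).fit(Z, y)
    return {"labels": labels, "n_clusters": n_clusters,
            "keep": keep, "ica": ica, "clf": clf}


def predict(model, X):
    """Apply the stored gene grouping, ICA, and SVM to new samples."""
    centers = cluster_means(X, model["labels"], model["n_clusters"])
    Z = model["ica"].transform(centers[:, model["keep"]])
    return model["clf"].predict(Z)


def make_toy_data(n_samples, n_genes=200, n_informative=30, seed=0):
    """Synthetic two-class data: only the first genes carry class signal."""
    rng = np.random.default_rng(seed)
    y = np.arange(n_samples) % 2
    X = rng.normal(size=(n_samples, n_genes))
    X[:, :n_informative] += 3.0 * y[:, None]
    return X, y


X_train, y_train = make_toy_data(40, seed=0)
X_test, y_test = make_toy_data(20, seed=1)
model = fit(X_train, y_train)
accuracy = float(np.mean(predict(model, X_test) == y_test))
print(f"held-out accuracy on toy data: {accuracy:.2f}")
```

Averaging each gene cluster before ranking is what lets the sketch score groups of correlated genes rather than single genes, which is the concern the abstract raises about single-gene evaluation. On real data with far more genes than samples, the clustering, ranking, and ICA steps would need to be fitted inside each cross-validation fold to avoid selection bias.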