Gene Expression Data Feature Extraction Based on Discriminant Principal Component Analysis (cited by: 2)
Abstract: To address feature extraction from high-dimensional, small-sample data, a discriminant principal component analysis method is proposed that fuses principal component analysis (PCA) and linear discriminant analysis (LDA). Linear discrimination is applied to each PCA principal component individually, and the components that mainly reflect between-class differences are selected to construct the feature space. In this way, a subspace with maximum inter-class variation and minimum intra-class variation is established. The method is used to reduce the dimensionality of, and extract features from, the yeast and NCI gene expression data sets, and clustering is performed on the reduced data. Experimental results indicate that the method obtains good discriminant features while reducing dimensionality and avoids the singularity problem of LDA. The clustering recognition rate in the subspace is more than 20% higher than with PCA, and visualization is also better, which demonstrates the effectiveness of the method for feature extraction from high-dimensional, small-sample data.
Source: Journal of Yanshan University (《燕山大学学报》), CAS-indexed, 2010, No. 5, pp. 426-430 (5 pages)
Funding: National Natural Science Foundation of China (20875073); Zhengzhou Municipal Major Research Project (072SGZS38042)
Keywords: principal component analysis (PCA); linear discriminant analysis (LDA); subspace; gene expression data
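
The selection step described in the abstract (score each PCA component by a one-dimensional linear discriminant, then keep the components that best separate the classes) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the scoring criterion, so the Fisher ratio used here, the function names, and the synthetic data are all hypothetical.

```python
import numpy as np


def pca_scores(X):
    """Center X (samples x features) and return (scores, components) via thin SVD."""
    Xc = X - X.mean(axis=0)
    # With n samples << d features, at most n - 1 components carry variance.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > 1e-10 * s.max()  # drop numerically null directions
    return U[:, keep] * s[keep], Vt[keep]


def fisher_ratio(z, y):
    """Assumed criterion: 1-D between-class scatter divided by within-class scatter."""
    mean_all = z.mean()
    between = 0.0
    within = 0.0
    for c in np.unique(y):
        zc = z[y == c]
        between += zc.size * (zc.mean() - mean_all) ** 2
        within += ((zc - zc.mean()) ** 2).sum()
    return between / (within + 1e-12)  # small constant guards against division by zero


def select_discriminant_components(X, y, k):
    """Keep the k principal components with the largest per-component Fisher ratio."""
    scores, components = pca_scores(X)
    ratios = np.array([fisher_ratio(scores[:, j], y) for j in range(scores.shape[1])])
    order = np.argsort(ratios)[::-1][:k]
    return scores[:, order], components[order], ratios[order]


# Synthetic stand-in for a small-sample, high-dimensional data set (20 x 200).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 10)
X = rng.normal(size=(20, 200))
X[:, :5] *= 5.0          # high-variance features unrelated to the class label
X[y == 1, 5:10] += 2.0   # class difference carried by lower-variance features
Z, W, r = select_discriminant_components(X, y, k=2)
```

Because each component is scored on its own one-dimensional projection, no within-class scatter matrix has to be inverted, which is one plausible reading of how the method sidesteps the LDA singularity mentioned in the abstract.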