An improved spectral clustering algorithm and its application in gene expression profile analysis (cited by: 2)

Abstract: Cluster analysis is one of the main methods for extracting biomedical information from gene expression profile data. To address the problem that traditional spectral clustering algorithms cannot determine the number of clusters, an improved spectral clustering algorithm is proposed and applied to the cluster analysis of gene expression profiles. The algorithm first constructs a normalized Laplacian matrix from the gene expression profile data and obtains the corresponding eigenvalues and eigenvectors by eigenvalue decomposition; the eigengap is used to describe the difference between adjacent eigenvalues. The number of clusters is then determined by finding the maximum of the eigengap sequence. Finally, the data are partitioned into classes directly from the unit-normalized eigenvectors. Experiments on simulated data and cancer data demonstrate the effectiveness of the algorithm.
Source: Journal of Anhui University (Natural Science Edition), 2012, No. 5, pp. 67-72 (6 pages). Indexed in: CAS, PKU Core (北大核心).
Funding: National Natural Science Foundation of China (60772121); Natural Science Foundation of Anhui Province (1208085MF93); Anhui University "211 Project" Academic Innovation Team Fund (KJTD007A).
Keywords: spectral clustering; eigengap; Laplacian matrix; gene expression profile
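
The steps summarized in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the abstract does not specify the similarity function, the exact form of the normalized Laplacian, or how the final partition is read off the eigenvectors. The Gaussian kernel, the parameters `sigma` and `max_k`, the symmetric form L = I - D^{-1/2} W D^{-1/2}, and the cosine-based assignment step are all assumptions made here for illustration, and the function name `eigengap_spectral_clustering` is hypothetical.

```python
import numpy as np


def eigengap_spectral_clustering(X, sigma=1.0, max_k=10):
    """Cluster the rows of X; the cluster count is chosen by the largest eigengap."""
    n = X.shape[0]

    # Gaussian similarity matrix (assumed; the abstract does not name a kernel).
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2} (assumed form).
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(n) - inv_sqrt[:, None] * W * inv_sqrt[None, :]

    # eigh returns the eigenvalues of a symmetric matrix in ascending order.
    vals, vecs = np.linalg.eigh(L)

    # Eigengap sequence: differences between adjacent eigenvalues.
    m = min(max_k, n - 1)
    gaps = np.diff(vals[: m + 1])
    k = int(np.argmax(gaps)) + 1  # largest gap sits after the k-th eigenvalue

    # Keep the first k eigenvectors and scale every row to unit length.
    U = vecs[:, :k]
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # Assumed partition step: greedily pick k mutually dissimilar rows as
    # centers, then assign each row to the center with the highest cosine.
    centers = [0]
    for _ in range(1, k):
        sim = np.abs(U @ U[centers].T).max(axis=1)
        centers.append(int(np.argmin(sim)))
    labels = np.argmax(U @ U[centers].T, axis=1)
    return k, labels
```

When the data form well-separated groups, the normalized Laplacian has one near-zero eigenvalue per group, so the largest gap tends to fall immediately after them; that is the property the eigengap criterion relies on. On real expression data the result is sensitive to the kernel width, so `sigma` would need tuning.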