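A Novel Method for Evaluation of Clustering Results for Gene Expression Data

Original Chinese title: 基因表达数据聚类分析结果的评价方法研究 (Research on Evaluation Methods for Clustering Analysis Results of Gene Expression Data). Cited by: 7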
Abstract

Objective: Many clustering algorithms have been used to analyze gene expression data, but little guidance exists for evaluating and choosing among them. This study aims to establish a systematic framework for selecting the best clustering algorithm and to provide a criterion for identifying the best clustering result for gene expression data.

Methods: Clustering results are judged from two perspectives: data structure (internal information) and functional classification (external information). First, to examine the predictive power of the clustering algorithms, Entropy (information entropy) is used to measure how well the clusters produced by each algorithm agree with the known, validated functional classification of a subset of genes (the external information). Second, a modified figure of merit (adjust-FOM) is used as the internal assessment: a clustering algorithm is applied to the data from all experimental conditions but one, and the left-out condition is used to assess the predictive power of the resulting clusters. Combining the two yields a new evaluation method, which the authors call the Entropy-FOM method.

Results: The proposed method, based on entropy and the figure of merit (FOM), was used to assess the results obtained by different algorithms. Six clustering algorithms were evaluated on three gene expression data sets (Iyer's serum data sets and Ferea's Saccharomyces cerevisiae data set), and adjust-FOM and Entropy-FOM plots are given for the six methods.

Conclusion: According to the adjust-FOM and Entropy-FOM curves, both SOM and fuzzy clustering show the highest clustering ability on the three data sets.
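The abstract names the two ingredients of the evaluation but does not give their formulas, so the following is only a minimal sketch under stated assumptions. It assumes the external measure is the usual size-weighted entropy of known functional classes within each cluster, and that adjust-FOM follows the adjusted figure of merit described by Yeung et al. (reference 4): leave out one condition, cluster on the rest, measure the within-cluster deviation on the left-out condition, sum over conditions, and divide by sqrt((n - k) / n). The function names and the small k-means stand-in are hypothetical and are not taken from the paper. How the paper combines the two scores into Entropy-FOM is not stated in the abstract, so that step is not attempted here.

```python
import numpy as np


def clustering_entropy(cluster_labels, class_labels):
    """Size-weighted average entropy (bits) of known classes inside each cluster.

    Assumed definition: 0 means every cluster holds genes of one known class.
    """
    cluster_labels = np.asarray(cluster_labels)
    class_labels = np.asarray(class_labels)
    n = len(cluster_labels)
    total = 0.0
    for c in np.unique(cluster_labels):
        members = class_labels[cluster_labels == c]
        _, counts = np.unique(members, return_counts=True)
        p = counts / counts.sum()
        total += (len(members) / n) * -(p * np.log2(p)).sum()
    return total


def kmeans_labels(data, k, n_iter=50, seed=0):
    """Tiny k-means, used only as a stand-in clustering algorithm (assumption)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = data[labels == j].mean(axis=0)
    return labels


def adjusted_fom(data, k, cluster_fn=kmeans_labels):
    """Adjusted figure of merit for k clusters (requires k < number of genes).

    data: genes x conditions matrix. Lower values suggest better predictive power.
    """
    data = np.asarray(data, dtype=float)
    n, m = data.shape
    total = 0.0
    for e in range(m):
        # Cluster on all conditions except e, then score on condition e.
        labels = cluster_fn(np.delete(data, e, axis=1), k)
        held_out = data[:, e]
        sq_err = 0.0
        for c in np.unique(labels):
            vals = held_out[labels == c]
            sq_err += ((vals - vals.mean()) ** 2).sum()
        total += np.sqrt(sq_err / n)
    # Adjustment factor compensates for the bias toward larger k.
    return total / np.sqrt((n - k) / n)
```

In use, one would compute `adjusted_fom(expression_matrix, k)` over a range of k for each algorithm, and `clustering_entropy` on the subset of genes with known functional classes, then compare the resulting curves across algorithms; lower is better for both scores under these assumed definitions.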
Source: Chinese Journal of Health Statistics (《中国卫生统计》), indexed in CSCD and the Peking University Core Journals list, 2002, No. 6, pp. 332-335 (4 pages)
Keywords: gene expression; clustering analysis; evaluation of clustering; Entropy-FOM evaluation; Entropy evaluation; adjust-FOM
References (9)

  • 1. Yi Dong, Zhang Yanqi, Wang Wenchang, Zhang Wei, Yang Mengsu, Huang Minghui, Fang Zhijun. Application of a fuzzy clustering method based on the pseudo-F statistic to gene expression data analysis [J]. Chinese Journal of Health Statistics, 2002, 19(3):146-150. Cited by: 7
  • 2. Brazma A, Vilo J. Gene expression data analysis. FEBS Letters, 2000, 480:17-24.
  • 3. Eisen MB, Spellman PT, Brown PO, et al. Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences USA, 1998, 95:14863-14868.
  • 4. Yeung KY, Haynor DR, Ruzzo WL. Validating clustering for gene expression data. Tech. Rep. UW-CSE-00-01-01, Dept. of Computer Science and Engineering, University of Washington, 2000.
  • 5. Mangasarian OL. Mathematical programming in data mining. Data Mining and Knowledge Discovery, 1997, 12(1):183-201.
  • 6. Scott C, Surya N, Sun P, et al. Adaptive fuzzy leader clustering of complex data sets in pattern recognition. IEEE Trans. Neural Networks, 1992, 3(5):145-165.
  • 7. Anderberg MR. Cluster analysis for applications. Academic Press, 1973.
  • 8. Ferea TL. Systematic changes in gene expression patterns following adaptive evolution in yeast. Proc. Natl. Acad. Sci. USA, 1999, 96:9721-9726.
  • 9. Iyer VR. The transcriptional program in the response of human fibroblasts to serum. Science, 1999, 283:83-87.
