Abstract
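Objective: To explore an information-entropy method for evaluating the clustering results of gene expression data, and so provide a quantitative basis for choosing a clustering method rationally. Methods: Using the entropy measure, we examined how well the groupings produced by six commonly used clustering methods agree with the classification of a subset of genes with known function, and took this measure as an evaluation criterion. Results: The method was applied to evaluate clustering results on Lyer's serum-stimulation expression dataset, and entropy plots for the six clustering methods are given. Conclusion: This study proposes, for the first time, evaluating clustering results with entropy theory, and observes that for the same dataset the evaluation results differ when the functional classification information differs.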
Objective: To establish a systematic framework for selecting the best clustering algorithm and to provide an entropy-based evaluation method for clustering analysis of gene expression data. Methods: Based on information theory, entropy is used to measure the consistency between the clustering results of six algorithms and known, validated functional classifications. Results: We applied the entropy method to Lyer's gene expression data and obtained six entropy curves, one per clustering algorithm. Conclusion: According to the entropy curves, the SOM and fuzzy clustering methods show the best clustering ability on these two datasets.
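The abstract does not state the entropy formula it uses. As an illustration only, the sketch below assumes the commonly used size-weighted cluster entropy: for each cluster, compute the Shannon entropy of the known functional classes among its members, then average over clusters weighted by cluster size. The function name `clustering_entropy` and the toy labels are hypothetical and do not come from the paper. Lower values mean the clusters agree better with the known classification.

```python
import math
from collections import Counter


def clustering_entropy(cluster_labels, known_classes):
    """Size-weighted mean entropy (in bits) of known classes within each cluster.

    Assumed definition, not taken from the paper:
        E = sum_j (n_j / n) * ( - sum_i p_ij * log2(p_ij) )
    where p_ij is the fraction of cluster j's genes that belong to class i.
    """
    if len(cluster_labels) != len(known_classes):
        raise ValueError("cluster_labels and known_classes must have equal length")
    n = len(cluster_labels)
    if n == 0:
        raise ValueError("at least one annotated gene is required")

    # Group the known class of every gene by the cluster it was assigned to.
    members = {}
    for cluster, cls in zip(cluster_labels, known_classes):
        members.setdefault(cluster, []).append(cls)

    total = 0.0
    for classes in members.values():
        size = len(classes)
        counts = Counter(classes)
        # Shannon entropy of the class distribution inside this cluster.
        h = -sum((m / size) * math.log2(m / size) for m in counts.values())
        total += (size / n) * h
    return total


# Toy example with hypothetical labels: two clusters, two functional classes.
pure = clustering_entropy([0, 0, 1, 1], ["cell cycle", "cell cycle", "signaling", "signaling"])
mixed = clustering_entropy([0, 0, 1, 1], ["cell cycle", "signaling", "cell cycle", "signaling"])
```

Under this assumed definition, computing the value for each algorithm over a range of cluster counts would yield one curve per algorithm, comparable in spirit to the entropy curves the abstract mentions.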
Source
《第三军医大学学报》
CAS
CSCD
Peking University Core Journals
2004, Issue 4, pp. 317-319 (3 pages)
Journal of Third Military Medical University