Abstract
Clustering gene expression data by expression similarity alone cannot fully reveal the functional similarity between genes. To address this, a clustering method based on transitive co-expression between genes is proposed. First, a gene correlation graph is built from the correlation coefficients between gene expression profiles. Next, the transitive co-expression relationship between genes is obtained by shortest-path analysis on this graph and is used as the similarity measure between genes. Finally, the genes are clustered with the k-means algorithm. Clustering experiments on Yeast gene expression data, compared against expression-similarity-based clustering, show that the transitive co-expression-based method achieves better clustering performance and higher clustering accuracy. These results indicate that clustering based on transitive co-expression better reveals the nature of gene similarity.
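The first two steps of the pipeline described above (correlation graph, then shortest paths) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the Pearson coefficient, the edge weight `1 - |r|`, the cutoff `threshold`, and the function names `pearson` and `transitive_distances` are all choices made here for the example, since the abstract does not specify them.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: treat as uncorrelated (assumption)
    return cov / (sx * sy)

def transitive_distances(profiles, threshold=0.8):
    """All-pairs shortest-path distances on a thresholded correlation graph.

    Assumptions of this sketch: two genes are joined by an edge when
    |r| >= threshold, and the edge weight is 1 - |r|.
    Unreachable pairs keep the distance math.inf.
    """
    n = len(profiles)
    dist = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    # Step 1: build the gene correlation graph.
    for i in range(n):
        for j in range(i + 1, n):
            r = abs(pearson(profiles[i], profiles[j]))
            if r >= threshold:
                dist[i][j] = dist[j][i] = 1.0 - r
    # Step 2: Floyd-Warshall shortest paths give the transitive relationship.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][k] + dist[k][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist

# Toy data: genes 0 and 2 are uncorrelated with each other,
# but both correlate with gene 1; gene 3 correlates with none.
example = [
    [1, -1, 0, 0],
    [1, -1, 1, -1],
    [0, 0, 1, -1],
    [1, 1, -1, -1],
]
d = transitive_distances(example, threshold=0.7)
```

In the toy data, genes 0 and 2 have no direct edge yet end up at a finite distance through gene 1, which is the kind of indirect relationship that expression similarity alone would miss. The resulting matrix could then be handed to a clustering step; how the paper adapts k-means to this pairwise measure is not stated in the abstract.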
Source
Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》)
EI
CSCD
Peking University Core Journals
2012, No. 6, pp. 894-899 (6 pages)
Funding
Supported by the Fundamental Research Funds for the Central Universities (No. K5051203013)
Keywords
Gene Expression Data
Clustering
Expression Similarity
Function Similarity
Transitive Co-Expression