Abstract
Feature metabolic pathways were selected with a random forest-pathway analysis method, using the classification error rate on out-of-bag (OOB) samples as the selection criterion. On each feature pathway, the correlation of gene expression was studied, and the MAP (Mining attribute profile) algorithm was used to mine co-regulated gene expression patterns under different experimental conditions. The discovered patterns were then clustered by average-linkage hierarchical clustering. The results showed that genes on the same feature pathway tend to have similar expression. Co-regulated patterns were found in two feature pathways, one containing 108 patterns and the other containing 1 pattern. When the 108 patterns were clustered, the lowest Pearson similarity coefficient among the clusters was still as high as 0.623, indicating that the co-expression patterns of genes on the same KEGG (Kyoto Encyclopedia of Genes and Genomes) feature pathway remain highly similar across different experimental conditions. The approach offers a reference for studying complex diseases with pathways as gene modules and can provide biological insight into the analysis of microarray data.
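The clustering step can be illustrated with a minimal sketch of average-linkage hierarchical clustering that uses the Pearson coefficient as the similarity between patterns. This is not the authors' implementation: the function names (`pearson`, `average_linkage`) and the toy `patterns` data are hypothetical, and the sketch assumes no pattern is constant across conditions (which would make the coefficient undefined).

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # assumes neither profile is constant


def average_linkage(profiles):
    """Agglomerative clustering that returns the merge history.

    Each entry is (members_a, members_b, similarity), where similarity is
    the mean pairwise Pearson coefficient between the two merged clusters.
    """
    clusters = [(i,) for i in range(len(profiles))]
    history = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the highest average similarity.
        for a, b in combinations(clusters, 2):
            sims = [pearson(profiles[i], profiles[j]) for i in a for j in b]
            sim = sum(sims) / len(sims)
            if best is None or sim > best[2]:
                best = (a, b, sim)
        a, b, sim = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        history.append(best)
    return history


# Hypothetical toy patterns (rows) across four experimental conditions.
patterns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.1],
    [1.0, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
]
history = average_linkage(patterns)
# Lowest similarity at which any two clusters were joined.
lowest_similarity = min(sim for _, _, sim in history)
```

In this reading, the figure quoted in the abstract would correspond to `lowest_similarity` computed over the 108 patterns of one pathway; that interpretation is an assumption, since the abstract does not spell out how the coefficient was derived.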
Source
《生物工程学报》
CAS
CSCD
PKU Core Journals (北大核心)
2008, Issue 9, pp. 1643-1648 (6 pages)
Chinese Journal of Biotechnology
Keywords
gene expression, pathway, pattern, association