Abstract
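Gene expression profiles measured under a limited number of experimental conditions can effectively predict only the gene functional classes related to those conditions, so the range of predictable functional classes needs to be restricted. Accordingly, we first use Gene Ontology (GO) to select experiment-relevant functional classes that are enriched with differentially expressed genes. We then use support vector machine classifiers to refine the annotation of genes that have so far been annotated only to the parent node of an experiment-relevant functional class, predicting whether each such gene belongs to that class. Applied to a yeast gene expression data set, the method reached, after highly unbalanced training sets were removed, a mean true positive rate (precision) above 71% and a mean coverage (recall) above 47%.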
Gene expression profiles measured under a limited set of experimental conditions can effectively predict only those gene functional classes that are closely relevant to the conditions, so appropriate functional classes should be selected before gene functions are predicted. We identify experiment-relevant functional classes enriched with differentially expressed genes. Then, using support vector machine classifiers, we predict whether genes so far annotated only to the parent functional classes of these experiment-relevant classes belong to the pre-selected classes. Applied to a S. cerevisiae data set, the method achieves, after highly unbalanced training classes are removed, mean precision above 70% and mean recall above 46%.
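The two quantities the abstract relies on, enrichment of a functional class with differentially expressed genes and the precision and recall of the predictions, can be illustrated with a minimal sketch. The abstract does not say which statistical test was used to detect enrichment; the one-sided hypergeometric test below is an assumption, chosen because it is a common choice for GO enrichment. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from math import comb


def enrichment_p_value(n_genes, n_de, class_size, de_in_class):
    """Upper-tail hypergeometric probability P(X >= de_in_class).

    Assumed test (not stated in the abstract).
    n_genes     : total number of genes on the array
    n_de        : number of differentially expressed genes
    class_size  : number of genes annotated to the functional class
    de_in_class : differentially expressed genes inside the class
    """
    max_k = min(n_de, class_size)
    total = comb(n_genes, n_de)
    tail = sum(
        comb(class_size, k) * comb(n_genes - class_size, n_de - k)
        for k in range(max(0, de_in_class), max_k + 1)
    )
    return tail / total


def precision_recall(true_labels, predicted_labels):
    """Precision and recall for binary labels (1 = gene belongs to the class)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy numbers: 6000 genes, 300 differentially expressed,
    # a class of 40 genes of which 10 are differentially expressed.
    print(enrichment_p_value(6000, 300, 40, 10))
    print(precision_recall([1, 1, 0, 0, 1], [1, 0, 1, 0, 1]))
```

A class with a small p-value would be treated as experiment-relevant under this assumed test, and precision and recall would then be averaged over the classifiers trained for the selected classes.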
Source
Chinese Journal of Bioinformatics (《生物信息学》)
2005, No. 2, pp. 49-52 (4 pages)
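Funding
National Natural Science Foundation of China (30370798, 30170515, 30370388)
National 863 Program (2003AA2Z2051, 2002AA2Z2052)
Heilongjiang Province Key Science and Technology Research Project (GB03C602-4)
Harbin Municipal Science and Technology Research Project (2003AA3CS113)
Heilongjiang Natural Science Foundation (F0177)
Harbin Medical University "211 Project" Tenth Five-Year Plan Construction Program.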