Abstract
Gene Ontology (GO) is a standardized bioinformatics ontology that is widely used to annotate gene databases. These annotations remain incomplete, however, because of shortcomings in GO's structural design, because most annotation is still done manually, and because many attributes of genes have yet to be discovered. This paper uses a probabilistic decision tree, a machine learning method, to learn the underlying relationship between genes and the gene ontology and thereby predict a gene's ontology annotations, that is, its unknown attributes. Such predictions can guide gene database administrators in completing and correcting ontology annotations and help biologists design targeted experiments. As an application, the method is tested on the MGI gene database, and the analysis shows that its predictions are relatively accurate.
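The abstract does not spell out the tree construction, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each gene is represented by the set of ontology terms it already carries, that the label says whether a target term applies, and that each leaf stores a Laplace-smoothed probability. The term names `A`, `B`, `C` and the toy data are hypothetical placeholders, not real GO identifiers or MGI records.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of 0/1 labels (0 for an empty list)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def leaf(labels):
    # Laplace-smoothed probability that the target term applies at this leaf.
    return {"prob": (sum(labels) + 1) / (len(labels) + 2)}


def build(rows, labels, features, depth=2):
    """Grow a small tree; rows are sets of terms, labels are 0/1."""
    if depth == 0 or not features or len(set(labels)) < 2:
        return leaf(labels)

    def gain(term):
        yes = [y for r, y in zip(rows, labels) if term in r]
        no = [y for r, y in zip(rows, labels) if term not in r]
        if not yes or not no:
            return -1.0  # a split that separates nothing is never chosen
        n = len(labels)
        return (entropy(labels)
                - len(yes) / n * entropy(yes)
                - len(no) / n * entropy(no))

    best = max(features, key=gain)
    if gain(best) <= 0:
        return leaf(labels)
    rest = [f for f in features if f != best]
    yes = [(r, y) for r, y in zip(rows, labels) if best in r]
    no = [(r, y) for r, y in zip(rows, labels) if best not in r]
    return {
        "term": best,
        "yes": build([r for r, _ in yes], [y for _, y in yes], rest, depth - 1),
        "no": build([r for r, _ in no], [y for _, y in no], rest, depth - 1),
    }


def predict(tree, annotations):
    """Return the estimated probability that the target term applies."""
    while "prob" not in tree:
        tree = tree["yes"] if tree["term"] in annotations else tree["no"]
    return tree["prob"]


# Hypothetical toy data: existing annotations per gene, and whether the
# target term is known to apply (1) or not (0).
rows = [{"A", "B"}, {"A"}, {"A", "C"}, {"B"}, {"C"}, set()]
labels = [1, 1, 1, 0, 0, 0]
tree = build(rows, labels, ["A", "B", "C"])
```

Calling `predict(tree, {"A", "C"})` on this toy tree returns a probability rather than a hard yes/no, which is what would let a curator rank candidate annotations for review.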
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2004, Issue 25, pp. 167-170 (4 pages)
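Funding
National Natural Science Foundation of China (Grant No. 60273045)
Ministry of Science and Technology, preliminary research program for major basic research projects (Grant No. 2001CCA0300)
Shanghai Science and Technology Development Fund (Grant No. 025115032)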
Keywords
machine learning, probabilistic decision tree, ontology, bioinformatics, gene