Bacillus cereus ATCC 10987 is an important organism for comparative genomics of bacteria. However, the RefSeq annotation lists 5 603 genes on its chromosome, and this number appears questionable. In this paper, the protein-coding genes were identified by joint scoring with the Zcurve and Glimmer programs. To ensure the reliability of the predictions, the additional ORFs predicted by the joint discrimination, which are not included in the original annotation, were checked by BLAST homology searches against sequence databases. As a result, the number of protein-coding genes in the Bacillus cereus ATCC 10987 genome was re-determined as 5 180. This is markedly lower than the 5 603 of the original annotation, and several indicators suggest that the new annotation is more reliable. This relatively accurate gene set provides a basis for further study of closely related species.
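The triage implied by this workflow can be illustrated with a minimal sketch. It assumes that the gene calls of both programs are already available as coordinate lists and that "joint" support simply means both programs report an ORF with the same strand and stop position. This agreement rule stands in for the actual joint scoring scheme, which the abstract does not specify. The names `Orf`, `stop_key` and `triage` are hypothetical, and no Zcurve, Glimmer or BLAST call is made.

```python
from typing import List, NamedTuple, Tuple


class Orf(NamedTuple):
    start: int   # 1-based leftmost coordinate
    end: int     # 1-based rightmost coordinate (inclusive)
    strand: str  # "+" or "-"


def stop_key(orf: Orf) -> Tuple[str, int]:
    """Identify an ORF by strand and stop-codon side.

    Gene finders often disagree only on the start codon, so ORFs are
    matched on the stop position: the rightmost coordinate on the "+"
    strand, the leftmost coordinate on the "-" strand.
    """
    return (orf.strand, orf.end if orf.strand == "+" else orf.start)


def triage(original: List[Orf], calls_a: List[Orf], calls_b: List[Orf]):
    """Split ORFs into retained, questionable and additional sets.

    Assumption: an ORF counts as jointly supported when both predictors
    report the same strand and stop position.
    """
    supported = {stop_key(o) for o in calls_a} & {stop_key(o) for o in calls_b}
    annotated = {stop_key(o) for o in original}

    # Annotated ORFs that both predictors support.
    retained = sorted(o for o in original if stop_key(o) in supported)
    # Annotated ORFs lacking joint support: candidates for removal.
    questionable = sorted(o for o in original if stop_key(o) not in supported)
    # Jointly predicted ORFs missing from the annotation: these are the
    # candidates that would go on to a homology search.
    additional = sorted(
        o for o in calls_a if stop_key(o) in supported - annotated
    )
    return retained, questionable, additional


# Toy coordinates, invented purely for illustration.
original = [Orf(1, 300, "+"), Orf(500, 800, "-"), Orf(1000, 1200, "+")]
calls_a = [Orf(10, 300, "+"), Orf(500, 800, "-"), Orf(2000, 2600, "+")]
calls_b = [Orf(1, 300, "+"), Orf(500, 750, "-"), Orf(2000, 2600, "+"),
           Orf(1000, 1200, "+")]

retained, questionable, additional = triage(original, calls_a, calls_b)
print(len(retained), len(questionable), len(additional))
```

In this toy input the third annotated ORF is supported by only one predictor and is therefore flagged as questionable, while the ORF at 2000..2600 is reported by both predictors but absent from the annotation, so it would be passed to the homology search.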
Journal of Tianjin University of Technology