
An improved coding measure based on hexamer usage (cited by: 2)
Abstract  Statistical characteristics of nucleotide composition are important information for identifying protein coding regions. However, the coding potential computed by the coding measures commonly used in gene prediction software is often closely correlated with the C+G content of the sequence, which degrades the recognition of protein coding regions. To address this, some well-known eukaryotic gene identification programs learn their parameters separately from reference sets with different C+G content (a class-based modeling strategy). This study finds that hexamer usage preference is approximately linearly correlated with the hexamer's own C+G content, and on that basis proposes an improved hexamer usage preference model. By jointly considering hexamer usage frequency and hexamer C+G content, the model simply and effectively reduces the dependence of coding potential on sequence C+G content. Tests show that, compared with the class-based modeling strategy, the method needs less training data and recognizes protein coding regions better, so it can be used in gene prediction software to improve the prediction accuracy of protein coding regions and gene structures.
Source  Journal of Huazhong University of Science and Technology (Natural Science Edition), 2005, No. 7, pp. 107-110 (4 pages). Indexed in: EI, CAS, CSCD, Peking University Core Journals.
Funding  Supported by the National Natural Science Foundation of China (90203011) and the Natural Science Foundation of Hubei Province (2002AC014).
Keywords  coding measure; hexamer usage preference; C+G content; recognition of protein coding regions; gene prediction software
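
The abstract gives no formulas, so the following is only a minimal sketch of one way to read the idea. It computes a conventional hexamer log-odds preference from coding and non-coding training sequences, fits a straight line of preference against each hexamer's own C+G count, and subtracts that trend. The function names, the add-one smoothing, the in-frame step of 3, and the use of a least-squares residual as the "adjusted" preference are all assumptions made for illustration, not the authors' published model.

```python
import math
from collections import Counter
from itertools import product

# All 4096 hexamers over the DNA alphabet.
HEXAMERS = ["".join(p) for p in product("ACGT", repeat=6)]


def hexamer_freqs(seqs, step, pseudo=1.0):
    """Relative hexamer frequencies with additive smoothing (assumed choice)."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(0, len(s) - 5, step):
            counts[s[i:i + 6]] += 1
    total = sum(counts[h] for h in HEXAMERS) + pseudo * len(HEXAMERS)
    return {h: (counts[h] + pseudo) / total for h in HEXAMERS}


def preference(coding_seqs, noncoding_seqs):
    """Log-odds usage preference: in-frame coding vs. all-position non-coding."""
    fc = hexamer_freqs(coding_seqs, step=3)
    fn = hexamer_freqs(noncoding_seqs, step=1)
    return {h: math.log(fc[h] / fn[h]) for h in HEXAMERS}


def gc_count(hexamer):
    """Number of C and G bases in a hexamer (0 to 6)."""
    return hexamer.count("G") + hexamer.count("C")


def fit_line(pref):
    """Ordinary least-squares fit of preference against hexamer C+G count."""
    xs = [gc_count(h) for h in HEXAMERS]
    ys = [pref[h] for h in HEXAMERS]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def adjusted_preference(pref):
    """Hypothetical correction: remove the fitted linear C+G trend."""
    slope, intercept = fit_line(pref)
    return {h: pref[h] - (slope * gc_count(h) + intercept) for h in HEXAMERS}


def coding_potential(seq, pref, step=3):
    """Mean preference of the in-frame hexamers of seq (0.0 if none)."""
    seq = seq.upper()
    scores = [pref[seq[i:i + 6]]
              for i in range(0, len(seq) - 5, step)
              if seq[i:i + 6] in pref]
    return sum(scores) / len(scores) if scores else 0.0
```

Scoring a candidate region with `coding_potential(seq, adjusted_preference(pref))` instead of `coding_potential(seq, pref)` would then use a single parameter table for all sequences, which is the contrast the abstract draws with training separate tables per C+G class.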

References (6)

  • 1. Guigó R, Fickett J W. Distinctive sequence features in protein coding, genic non-coding, and intergenic human DNA[J]. J Mol Biol, 1995, 253(1): 51-60.
  • 2. Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA[J]. J Mol Biol, 1997, 268(1): 78-94.
  • 3. Rogic S, Mackworth A, Ouellette F B. Evaluation of gene-finding programs on mammalian sequences[J]. Genome Research, 2001, 11(5): 817-832.
  • 4. Snyder E E, Stormo G D. Identification of protein coding regions in genomic DNA[J]. J Mol Biol, 1995, 248(1): 1-18.
  • 5. Zhou Y H, Yang L, Wang H, Lu F, Wan H H. Eukaryotic gene structure prediction based on multilevel optimization[J]. Chinese Science Bulletin, 2004, 49(2): 140-145. (In Chinese; cited by: 7)
  • 6. Yan M, Lin Z S, Zhang C T. A new Fourier transform approach for protein coding measure based on the format of the Z curve[J]. Bioinformatics, 1998, 14(8): 685-690.

