Abstract
With the arrival of the post-genomic era, in which functional genomics and proteomics are the main research topics, biological data are growing exponentially. Inferring the structure and function of nucleic acids and proteins from their sequences by effective computational methods, and in particular gene prediction (identifying the protein-coding genes in a DNA sequence), is one of the research problems most urgently in need of a solution. Given the particular biological significance of CpG islands for the study of gene coding, this paper locates CpG islands by three methods. On that basis, it combines a new letter-vector representation of DNA sequences with an information-entropy-based discrete quantity to predict gene sequences. This improves the efficiency of recognizing coding genes and significantly reduces computation time.
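The abstract does not name the three methods used to locate CpG islands. Purely as an illustration, the sketch below applies one widely used sliding-window criterion: a window is flagged when its GC fraction exceeds 0.5 and its observed/expected CpG ratio exceeds 0.6. The function names, the 200 bp window, and the thresholds are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple


def cpg_stats(window: str) -> Tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for one window."""
    s = window.upper()
    n = len(s)
    c, g = s.count("C"), s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected ratio: (#CpG * N) / (#C * #G); 0 if C or G is absent.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def find_cpg_windows(seq: str, window: int = 200, step: int = 1,
                     min_gc: float = 0.5, min_ratio: float = 0.6) -> List[int]:
    """Return start offsets of windows that satisfy both thresholds.

    Assumed criterion for illustration only, not the paper's method.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, ratio = cpg_stats(seq[start:start + window])
        if gc > min_gc and ratio > min_ratio:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Toy sequence: a CpG-rich stretch followed by an AT-only stretch.
    toy = "CG" * 100 + "AT" * 100
    print(find_cpg_windows(toy, window=200, step=50))
```

Overlapping flagged windows would normally be merged into island intervals; that step is omitted here to keep the sketch short.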
Source
Journal of Biomathematics (《生物数学学报》)
CSCD
2012, Issue 2, pp. 342-348 (7 pages)
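Funding
National Natural Science Foundation of China (Grant No. 10871084)
Youth Foundation of the School of Science, Jiangnan University (Grant No. 20060928)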