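Abstract
To characterize the frequency distribution of k-tuples (k-words) in genomes, four kinds of information quantity are defined. A statistical analysis of several typical genome sequences is carried out, and the universal relation between these information quantities and the word length k is studied. The universality is explained as originating in the nearly neutral evolution of DNA sequences. The real challenge is identified as the exploration of specific words, and the meaning of specific words in the information system of DNA, RNA and protein interactions is discussed in a preliminary way.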
To describe the characteristics of the k-tuple frequency distribution in genomes, four kinds of information quantity are defined. From a statistical analysis of the DNA sequences of 16 typical genomes, a universal relation between k-tuple information entropy and word length k is deduced. It is suggested that this universality is related to the neutral mutation and random drift of molecular evolution. A conserved over- or under-represented oligonucleotide fragment is defined as a specific word. The implications of these specific words in the information network of DNA, RNA and protein interactions are discussed briefly.
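As an illustration of the kind of quantity involved, the following minimal sketch counts overlapping k-tuples in a sequence and computes the Shannon entropy of their frequency distribution. The abstract does not give the paper's four definitions, so this uses the standard Shannon entropy as an assumption; the function names `ktuple_counts` and `ktuple_entropy` are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import log2


def ktuple_counts(seq, k):
    """Count overlapping k-tuples (k-words) in a sequence string."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def ktuple_entropy(seq, k):
    """Shannon entropy (in bits) of the observed k-tuple frequencies.

    Assumption: plain Shannon entropy, used here only as an example;
    the paper's own four information quantities may be defined differently.
    """
    counts = ktuple_counts(seq, k)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


if __name__ == "__main__":
    example = "ACGTACGTTTGACA"  # toy sequence, not real genome data
    for k in (1, 2, 3):
        print(k, ktuple_entropy(example, k))
```

Evaluating such an entropy for increasing k over a whole genome gives one curve of information versus word length, which is the type of relation the abstract describes.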
Source
《合肥学院学报(自然科学版)》
2005, No. 1, pp. 1-6, 31 (7 pages in total)
Journal of Hefei University: Natural Sciences