Abstract
Sequence fragments of 200,000 bases were randomly selected from each of 30 species, and the effectiveness of several information parameters as genome identifiers was tested using analysis of variance and multiple comparison tests. The results show that, among the 16 kinds of partial information correlation, FA(k)C, FC(k)A, FG(k)T and FT(k)G have the strongest discriminating power, which indicates that genomic signatures prefer the AC and GT modes. Combining the information correlation with these four partial information correlations greatly improves the discriminating power. This suggests that information correlation and partial information correlation are effective tools for identifying genomes.
Source
《内蒙古大学学报(自然科学版)》
CAS
CSCD
PKU Core Journal
2011, No. 1, pp. 62-68 (7 pages)
Journal of Inner Mongolia University: Natural Science Edition
Funding
Supported by the National Natural Science Foundation of China (Grant No. 90403010)
Keywords
information correlation
partial information correlation
genomic signature
identifier