Abstract
Codon usage can be briefly described as the pattern in which the 64 codons are used in a species, genome, gene, or set of coding sequences. Visualizing codon usage offers an intuitive and effective graphical means of studying the codon usage characteristics of a sample and the differences in codon bias between samples. Correspondence analysis (CA), which can be based on a variety of codon parameters, is one important visualization method. Taking the codon usage data of eight species (Drosophila melanogaster, Homo sapiens, Arabidopsis thaliana, Saccharomyces cerevisiae, Caenorhabditis elegans, Escherichia coli, Bacillus subtilis, and Streptomyces avermitilis) as examples, this paper compares the CA visualizations obtained from five parameters: actual codon count (Number), codon usage frequency (F1000), codon usage fraction (Fraction), relative adaptiveness (RA), and relative synonymous codon usage (RSCU). The preliminary conclusions are as follows. (1) Whichever of the five parameters is used, the 65-dimensional data (1 species dimension plus 64 codon dimensions) can be reduced by CA to 7 dimensions with a cumulative inertia of 100%, that is, the 7 new dimensions retain all the information in the original data set. (2) F1000 is a good choice, with a cumulative inertia of 83.6% for the first two dimensions, but Fraction gives the highest value, 85.1%. Number and RSCU are slightly lower than Fraction; when computational cost is taken into account, Number is a better choice than RSCU or Fraction, since their distortion is similar. RA is the lowest, at only 76.9%.
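The abstract does not define the five parameters or describe the CA implementation, so the following is only a minimal sketch under stated assumptions. It assumes the conventional definitions: F1000 as the count per thousand codons, Fraction as the share within a synonymous family, RSCU as Fraction multiplied by the family size, and RA as the count divided by the largest count in the family. The CA step is the usual singular value decomposition of the standardized residuals of a samples-by-codons table. The function names, the toy counts, and the random table are hypothetical and are not the paper's data.

```python
import numpy as np


def codon_parameters(counts, families):
    """Return the five per-codon parameters for one sample.

    counts   : dict mapping codon -> observed count (Number)
    families : list of lists, each holding the synonymous codons of one amino acid
    Assumes every family has at least one observed codon (no zero family totals).
    """
    total = sum(counts[c] for fam in families for c in fam)
    params = {}
    for fam in families:
        fam_counts = [counts[c] for c in fam]
        fam_sum, fam_max = sum(fam_counts), max(fam_counts)
        for codon, n in zip(fam, fam_counts):
            fraction = n / fam_sum
            params[codon] = {
                "Number": n,                     # raw count
                "F1000": 1000.0 * n / total,     # count per thousand codons
                "Fraction": fraction,            # share within the synonymous family
                "RSCU": fraction * len(fam),     # observed / expected under equal usage
                "RA": n / fam_max,               # relative to the most used synonym
            }
    return params


def ca_cumulative_inertia(table):
    """Cumulative share of inertia per CA axis for a samples-by-codons table.

    Assumes all entries are non-negative and every row and column sum is positive.
    """
    n = np.asarray(table, dtype=float)
    p = n / n.sum()                              # correspondence matrix
    r = p.sum(axis=1)                            # row masses
    c = p.sum(axis=0)                            # column masses
    expected = np.outer(r, c)
    residuals = (p - expected) / np.sqrt(expected)
    singular_values = np.linalg.svd(residuals, compute_uv=False)
    inertia = singular_values ** 2               # principal inertias
    inertia = inertia[inertia > 1e-12 * inertia.sum()]  # drop numerically zero axes
    return np.cumsum(inertia) / inertia.sum()


# Hypothetical toy sample: one two-codon family and one four-codon family.
toy_counts = {"GAA": 30, "GAG": 10, "CCU": 10, "CCC": 20, "CCA": 10, "CCG": 20}
toy_families = [["GAA", "GAG"], ["CCU", "CCC", "CCA", "CCG"]]
toy_params = codon_parameters(toy_counts, toy_families)

# Hypothetical 8 x 64 table of random counts standing in for 8 species.
rng = np.random.default_rng(0)
toy_table = rng.integers(1, 1000, size=(8, 64))
cumulative = ca_cumulative_inertia(toy_table)
print(len(cumulative), cumulative[1])
```

A table with 8 rows has at most min(8, 64) - 1 = 7 non-trivial CA axes, which is consistent with the 7 dimensions and 100% cumulative inertia reported in conclusion (1). The second entry of `cumulative` corresponds to the two-dimensional cumulative inertia that the abstract compares across parameters.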
Source
Computers and Applied Chemistry (《计算机与应用化学》)
CAS
CSCD
Peking University Core Journals (北大核心)
2011, Issue 6, pp. 675-679 (5 pages)
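Funding
National Natural Science Foundation of China (21076172)
Fujian Province Major Science and Technology Project for University-Industry Cooperation (2010H6023)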
Keywords
visualization of codon usage
correspondence analysis
relative adaptiveness
relative synonymous codon usage
codon optimization