Funding: This work was funded by the National Basic Research Program of China (973 Program) to JY (Grant No. 2006CB910404).
Abstract: We present KaKs_Calculator 2.0, an updated version of an integrated stand-alone software package. It incorporates 17 methods for calculating nonsynonymous and synonymous substitution rates; among them are our modified versions of several widely used methods, the gamma series (γ-NG, γ-LWL, γ-MLWL, γ-LPB, γ-MLPB, γ-YN and γ-MYN), which have been shown to perform better than their original forms under certain conditions and were not implemented in the previous version. The package can readily be used to identify positively selected sites with a sliding window that moves across the protein-coding sequences of interest in the 5' to 3' direction, and it has improved overall performance on sequence analysis for evolutionary studies. A toolbox, including C++ and Java source code, executable files for both Windows and Linux platforms, and user instructions, can be downloaded for academic use from https://sourceforge.net/projects/kakscalculator2/.
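The sliding-window scan mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the KaKs_Calculator 2.0 implementation: the function names (`sliding_windows`, `count_codon_changes`, `scan`), the window and step sizes, and the per-window statistic are all assumptions made for illustration. In particular, the sketch only tallies synonymous and nonsynonymous codon differences per window, which is a crude stand-in for the Ka and Ks estimates that the 17 methods compute.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1); codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def sliding_windows(seq_a, seq_b, window_codons, step_codons):
    """Yield (start_nt, window_a, window_b) in 5'->3' order over a codon-aligned pair."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal in length, and a multiple of 3")
    n_codons = len(seq_a) // 3
    for first in range(0, n_codons - window_codons + 1, step_codons):
        start, end = first * 3, (first + window_codons) * 3
        yield start, seq_a[start:end], seq_b[start:end]


def count_codon_changes(win_a, win_b):
    """Crude tally of (synonymous, nonsynonymous) codon differences in one window."""
    syn = nonsyn = 0
    for i in range(0, len(win_a), 3):
        ca, cb = win_a[i:i + 3], win_b[i:i + 3]
        if ca == cb or ca not in CODON_TABLE or cb not in CODON_TABLE:
            continue  # identical codons, gaps and ambiguous bases are skipped
        if CODON_TABLE[ca] == CODON_TABLE[cb]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


def scan(seq_a, seq_b, window_codons=3, step_codons=1):
    """Return one (start_nt, synonymous, nonsynonymous) tuple per window."""
    return [
        (start, *count_codon_changes(a, b))
        for start, a, b in sliding_windows(seq_a, seq_b, window_codons, step_codons)
    ]
```

For example, `scan("ATGGCTAAATTTGGG", "ATGGCCGAATTCGGA")` slides a three-codon window one codon at a time and reports the tallies for the windows starting at nucleotides 0, 3 and 6. In a real analysis the per-window tally would be replaced by one of the Ka/Ks estimators.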
Funding: This work was supported by a faculty fund from King Abdullah University of Science and Technology (http://www.kaust.edu.sa) awarded to JY.
Abstract: The organization of the canonical genetic code needs to be thoroughly illuminated. Here we reorder the four nucleotides (adenine, thymine, guanine and cytosine) according to their emergence in evolution, and apply the organizational rules to devise an algebraic representation of the canonical genetic code. Under the framework of the devised code, we quantify codon and amino acid usages in a large collection of 917 prokaryotic genome sequences, and relate these usages to the code's intrinsic structure and classification schemes as well as to amino acid physicochemical properties. Our results show that the algebraic representation of the code is structurally equivalent to a content-centric organization of the code, and that codon and amino acid usages under different classification schemes are closely correlated with GC content, implying a set of rules that govern composition dynamics across a wide variety of prokaryotic genome sequences. These results also indicate that codons and amino acids are not randomly allocated in the code, and that the six-fold degenerate codons and their amino acids play important balancing roles in error minimization. The content-centric code is therefore highly useful for deciphering hitherto unknown regularities of the code as well as the dynamics of nucleotide, codon, and amino acid compositions.
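The quantities this abstract relates to one another (GC content, codon usage, amino acid usage and codon degeneracy) can be computed with a short Python sketch. It does not reproduce the authors' algebraic representation or their classification schemes; the function names are hypothetical, and the only assumption about the code itself is the standard genetic code (NCBI translation table 1).

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1); codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def codon_usage(cds):
    """Count in-frame codons of a coding sequence, skipping codons with ambiguous bases."""
    cds = cds.upper()
    codons = (cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def amino_acid_usage(codon_counts):
    """Collapse codon counts into amino acid counts ('*' marks stop codons)."""
    usage = Counter()
    for codon, n in codon_counts.items():
        usage[CODON_TABLE[codon]] += n
    return usage


def amino_acids_by_degeneracy(fold):
    """Amino acids encoded by exactly `fold` codons in the standard code."""
    degeneracy = Counter(AMINO_ACIDS)
    return sorted(aa for aa, n in degeneracy.items() if n == fold and aa != "*")
```

Calling `amino_acids_by_degeneracy(6)` lists the amino acids with six-fold degenerate codons (leucine, arginine and serine), the group the abstract singles out. Correlating usage with GC content across genomes would mean applying `gc_content` and `codon_usage` to each genome's coding sequences and comparing the resulting values.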