Deducing the function of specific sites within a protein requires prior knowledge of the strength of the selective pressure acting on them. Currently, statistical methods are the only option for evaluating the degree of conservation. In the statistical framework, selective pressure is classified as negative, nearly neutral, or positive. However, such quantitative methods may overlook some crucial amino acid sites among those classified as nearly neutral. In this study, we propose that cladistic information can also be important for evaluating the functional importance of amino acid sites. The ribosomal proteins of 62 eukaryotic species were chosen as a case study for statistical and cladistic analysis. The evolutionary changes at each site in the aligned sequences were mapped onto a currently well-accepted cladogram of eukaryotes. Hundreds of synapomorphic sites were discovered in various clades, only some of which were suggested to be potentially significant in the statistical framework. Notably, mutation of His213 of RPL10 in humans, a site that is synapomorphic in vertebrates but identified only as being under neutral selection, accounts for autism. Therefore, cladistic information can complement the statistical framework in understanding lineage-specific selection events. Additionally, the bias in the accumulation of apomorphic amino acids is significant going from the Chordata to the Mammalia lineages. This study emphasizes the value of analyzing transcriptomic and proteomic data cladistically to recognize group-specific selection on sites in proteins.
Funding: Supported by the National Natural Science Foundation of China (31222051, J1210005).
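The cladistic screening described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a strict criterion in which a column counts as synapomorphic for a clade when every clade member shares one residue and no taxon outside the clade carries it, and it skips the ancestral-state reconstruction on a cladogram that a full analysis would need. The function name `synapomorphic_sites`, the taxon labels, and the four-residue toy alignment are all hypothetical and invented for illustration.

```python
from typing import Dict, List, Set, Tuple


def synapomorphic_sites(
    alignment: Dict[str, str], clade: Set[str]
) -> List[Tuple[int, str]]:
    """Return (1-based column, residue) pairs that are shared by every
    clade member and absent from every taxon outside the clade.

    Simplifying assumption: "shared and exclusive" stands in for a
    synapomorphy; no tree-based ancestral reconstruction is performed.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    if clade - set(alignment):
        raise ValueError("clade contains taxa missing from the alignment")

    outgroup = [taxon for taxon in alignment if taxon not in clade]
    if not clade or not outgroup:
        raise ValueError("need at least one taxon inside and outside the clade")

    hits: List[Tuple[int, str]] = []
    for col in range(lengths.pop()):
        inside = {alignment[taxon][col] for taxon in clade}
        if len(inside) != 1:
            continue  # clade members disagree at this column
        residue = next(iter(inside))
        if residue in "-X?":
            continue  # ignore gaps and ambiguous residues
        outside = {alignment[taxon][col] for taxon in outgroup}
        if residue not in outside:
            hits.append((col + 1, residue))
    return hits


# Made-up toy alignment (not real ribosomal protein data).
toy_alignment = {
    "human": "MKHA",
    "mouse": "MKHA",
    "zebrafish": "MKHS",
    "fly": "MKQA",
    "yeast": "MRQA",
}
toy_vertebrates = {"human", "mouse", "zebrafish"}

print(synapomorphic_sites(toy_alignment, toy_vertebrates))  # [(3, 'H')]
```

In the toy data only column 3 qualifies: all three hypothetical vertebrate sequences carry H there while both outgroup sequences carry Q. A site flagged this way could still score as nearly neutral in a rate-based test, which is the kind of complementarity the abstract argues for.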