Abstract
Objective To evaluate the performance of three methods for identifying positively selected sites in evolving sequences. Methods The phylogenetic tree inferred from HA1 sequences of influenza A virus subtype H3 served as the evolutionary model for simulation. Data sets of 8, 16, 32, 64, and 128 sequences were simulated under different evolutionary parameters, and each simulated data set was analyzed with the counting method, the fixed effects likelihood method, and the random effects likelihood approach to identify positively selected sites. Results The sensitivity of all three methods increased with the number of sequences and with the strength of selection. For data with the same number of sequences and the same selection strength, the fixed effects likelihood method was significantly more sensitive than the counting method and the random effects likelihood approach. The counting method was computationally the fastest, followed by the fixed effects likelihood method; the random effects likelihood approach was the slowest. Conclusion For influenza virus data, characterized by rapid evolution and large sequence samples, the fixed effects likelihood method appears better suited to identifying positively selected sites than the other two methods.
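The comparison above rests on sensitivity: the share of sites simulated under positive selection that a method actually flags. The abstract does not give the authors' exact computation, so what follows is only a minimal sketch of the usual definition. The function name `sensitivity`, the site indices, and the per-method calls are hypothetical placeholders for illustration, not data or results from the paper.

```python
def sensitivity(true_sites, detected_sites):
    """Return the fraction of truly selected sites that were detected."""
    true_sites = set(true_sites)
    if not true_sites:
        raise ValueError("no positively selected sites were simulated")
    # True positives are sites that are both simulated as selected and flagged.
    true_positives = true_sites & set(detected_sites)
    return len(true_positives) / len(true_sites)


# Hypothetical placeholder values (NOT from the paper): codon positions
# simulated under positive selection, and the sites each method flagged.
simulated = {12, 45, 78, 101}
calls = {
    "counting": {12},
    "fixed_effects": {12, 45, 78, 130},  # site 130 is a false positive
    "random_effects": {12, 45},
}

results = {name: sensitivity(simulated, sites) for name, sites in calls.items()}
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

False positives, such as site 130 here, do not affect sensitivity; a fuller comparison would report them separately, for example as a false positive rate.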
Source
《中国卫生统计》 (Chinese Journal of Health Statistics)
CSCD
Peking University Core Journals (北大核心)
2007, Issue 5, pp. 476-479 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. 30400370)
Keywords
Evolution
Positively selected sites
Counting method
Fixed effects likelihood method
Random effects likelihood approach