Abstract
[Objective] To extract statistical features of complete tobacco necrosis virus genomes and to cluster the genomes on those features. [Method] For each complete tobacco necrosis virus genome, every base was combined with the two bases that follow it to form a three-base group, and these groups were arranged into a new sequence S. The probability of occurrence on S of each of the 64 possible three-base groups was calculated, giving a 64-dimensional vector L. Comparing the L vectors of the genomes identified 8 three-base groups whose probabilities differed markedly between genomes. [Result] The occurrence probabilities of the 8 three-base groups (CGT, CAA, TAC, TAA, AGG, ATG, ATA, AAG) showed an important association with genetic variation among tobacco necrosis virus genomes; according to this variation, 4 complete genomes from different sources formed 2 major clusters. [Conclusion] The method is generally applicable to the analysis of gene sequences of various tobacco viruses and provides a theoretical basis for the control of tobacco necrosis virus disease.
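The [Method] step can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the handling of ambiguous symbols, and the use of the max-minus-min range to rank "markedly different" three-base groups are all assumptions made for illustration, since the abstract does not state the selection criterion.

```python
from itertools import product

BASES = "ACGT"
# All 64 three-base groups in lexicographic order (AAA ... TTT).
TRIPLETS = ["".join(p) for p in product(BASES, repeat=3)]


def triplet_probabilities(genome):
    """Return a 64-dimensional vector L of overlapping three-base group probabilities.

    Each base is grouped with the two bases that follow it, so a genome of
    length n yields n - 2 groups (the sequence S in the abstract).
    """
    seq = genome.upper()
    counts = dict.fromkeys(TRIPLETS, 0)
    total = 0
    for i in range(len(seq) - 2):
        group = seq[i:i + 3]
        if group in counts:  # assumption: skip windows with symbols such as N
            counts[group] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence has no valid three-base windows")
    return [counts[t] / total for t in TRIPLETS]


def most_variable_triplets(vectors, top=8):
    """Rank three-base groups by the range of their probability across genomes.

    The range (max - min) is a hypothetical stand-in for the paper's
    unstated criterion of a 'marked difference'.
    """
    spreads = [
        (max(column) - min(column), TRIPLETS[j])
        for j, column in enumerate(zip(*vectors))
    ]
    spreads.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in spreads[:top]]
```

Calling `triplet_probabilities` on each complete genome and passing the resulting vectors to `most_variable_triplets` gives a candidate list of discriminating three-base groups; those components of L could then be fed to a clustering routine of the reader's choice.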
Source
Journal of Anhui Agricultural Sciences (《安徽农业科学》)
CAS
PKU Core (北大核心)
2010, Issue 13, pp. 6803-6804 (2 pages)
Funding
Hubei Province "Eleventh Five-Year Plan" for Education Science Development (2006B131)
Keywords
Tobacco necrosis virus
Three-base groups
Probability
K-M clustering