Abstract
This paper proposes a new correlation feature, TBC, which represents the differences between sequences by the difference between the joint probability distributions of trinucleotides and single nucleotides. A shift-range transformation is applied to the TBC feature matrix, a fuzzy similarity matrix is built with the exponential Chebyshev distance method, and the phylogenetic tree is constructed with the transitive closure method of fuzzy clustering. The method requires no multiple sequence alignment and is computationally simple. Phylogenetic trees built for two sets of genome sequences, 48 hepatitis E virus sequences and 24 complete coronavirus genomes, confirm the effectiveness of the method.
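The clustering stage described in the abstract can be illustrated with a minimal sketch. The abstract does not define the TBC feature or give the exact formulas, so several things here are assumptions: the input rows are placeholder feature vectors (not real TBC values), the shift-range transformation is taken to be the column-wise `(x - min) / (max - min)` scaling, and the similarity is taken to be `r_ij = exp(-max_k |x_ik - x_jk|)`. All function names are hypothetical. The transitive closure itself is the standard max-min composition, squared until it stops changing.

```python
import math

def shift_range_normalize(X):
    """Column-wise (x - min) / (max - min); constant columns map to 0 (assumed form)."""
    cols = list(zip(*X))
    lo = [min(c) for c in cols]
    span = [max(c) - min(c) for c in cols]
    return [[(x - l) / s if s else 0.0 for x, l, s in zip(row, lo, span)]
            for row in X]

def exp_chebyshev_similarity(X):
    """Assumed similarity: r_ij = exp(-Chebyshev distance between rows i and j)."""
    n = len(X)
    return [[math.exp(-max(abs(a - b) for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def max_min_compose(R, S):
    """Max-min composition of two square fuzzy relations."""
    n = len(R)
    return [[max(min(R[i][k], S[k][j]) for k in range(n))
             for j in range(n)] for i in range(n)]

def transitive_closure(R):
    """Square R under max-min composition until it is unchanged."""
    while True:
        R2 = max_min_compose(R, R)
        if R2 == R:  # exact: composition only selects existing entries
            return R
        R = R2

def lambda_cut_clusters(T, lam):
    """Group indices whose closure similarity is at least lam."""
    clusters, seen = [], set()
    for i in range(len(T)):
        if i in seen:
            continue
        group = [j for j in range(len(T)) if T[i][j] >= lam]
        seen.update(group)
        clusters.append(group)
    return clusters

# Placeholder feature rows for three sequences (not real TBC values).
features = [[0.0, 2.0], [0.1, 1.8], [1.0, 0.0]]
T = transitive_closure(exp_chebyshev_similarity(shift_range_normalize(features)))
print(lambda_cut_clusters(T, 0.9))
print(lambda_cut_clusters(T, 0.4))
```

Lowering the threshold `lam` from 1 toward 0 merges clusters step by step, and the sequence of merges gives the hierarchy that can be drawn as a tree.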
Source
Application Research of Computers (《计算机应用研究》)
CSCD
Peking University Core Journals
2011, Issue 8, pp. 2844-2847 (4 pages)
Funding
Supported by the National Natural Science Foundation of China (No. 60873184)
Keywords
genome
phylogenetic analysis
correlation feature
phylogenetic tree
fuzzy clustering