Abstract
This paper proposes a method for classifying species from 16S rRNA gene sequences that differs from the phylogenetic tree approach. First, the bases of each gene are converted from letters to numbers to build a multi-dimensional vector. Principal component analysis (PCA) then projects these vectors onto the directions along which the data vary most, so that the original data are expressed as a linear combination of a few "principal components" while retaining the information in the original data. Finally, a three-dimensional projection view of the principal component features is plotted to classify the sequences. The method gave good results in the classification and identification of Bifidobacterium and Enterococcus.
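The pipeline described in the abstract (numeric encoding, PCA, three-component projection) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which numeric code is assigned to each base, so the `BASE_CODE` mapping, the zero padding, the helper names `encode` and `pca_project`, and the toy sequences are all assumptions made for this example. PCA is computed here with a NumPy singular value decomposition of the mean-centered data.

```python
import numpy as np

# Hypothetical base-to-number mapping; the abstract does not give the paper's encoding.
BASE_CODE = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0, "U": 4.0}


def encode(seq, length):
    """Convert a base string to a fixed-length numeric vector.

    Unknown bases and padding positions are encoded as 0.0 (an assumption).
    """
    codes = [BASE_CODE.get(base, 0.0) for base in seq.upper()[:length]]
    return np.array(codes + [0.0] * (length - len(codes)))


def pca_project(X, n_components=3):
    """Project the rows of X onto the leading principal components.

    Returns the projected scores and the fraction of total variance
    explained by each retained component.
    """
    Xc = X - X.mean(axis=0)                      # center each position (column)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, Vt.shape[0])
    scores = Xc @ Vt[:k].T                       # coordinates along the top-k directions
    explained = S[:k] ** 2 / np.sum(S ** 2)
    return scores, explained


# Toy sequences invented for illustration; they are not real 16S rRNA data.
toy = [
    "ACGTACGTAC", "ACGTACGTAA", "ACGTACGAAC",   # made-up group 1
    "TTGCATGCCA", "TTGCATGCCG", "TTGCTTGCCA",   # made-up group 2
]
X = np.vstack([encode(s, 10) for s in toy])
scores, explained = pca_project(X, n_components=3)
print(scores.shape)
print(explained)
```

Each row of `scores` gives the three coordinates that would be plotted in a three-dimensional view; sequences from the same group are expected to fall close together. The `explained` values indicate how much of the total variance the three retained components capture, which is a practical check on how much information the projection keeps.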
Source
Journal of Microbiology (《微生物学杂志》)
CAS
CSCD
2015, Issue 6, pp. 105-108 (4 pages)
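Funding
Shandong Province Outstanding Young and Middle-aged Scientists Research Award Fund (BS2011SW005)
Shandong Province Science and Technology Key Research Fund Project (2007GG3WZ05009)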
Keywords
species taxonomy
base sequence
vector
principal component analysis