Abstract
To assess how closely the evolutionary relationships inferred from the variable regions of the bacterial 16S rRNA gene match those inferred from its full-length sequence, bacterial 16S rRNA gene sequences provided by the Ribosomal Database Project (RDP) were studied. After data preprocessing, including extraction and screening of the variable regions, the actual base counts and the numbers of operational taxonomic units (OTUs) of the variable regions were statistically analyzed. The results showed that the V2, V3, and V4 variable regions, and V2 and V4 in particular, are not only longer than the other variable regions but also contain far more actual bases, and therefore carry more sequence information. A hierarchical distance matrix algorithm was established, and the distance difference values between the evolutionary trees built from the V2, V3, and V4 variable regions and the tree built from the full-length sequence were calculated as 59052, 87154, and 45848, respectively. The V4 variable region is therefore closest to the full-length sequence in evolutionary relationship, and an evolutionary tree built from V4 is more credible than one built from V2 or V3. The hierarchical distance matrix algorithm also performed better than some traditional distance and similarity algorithms.
Authors
刘爽爽
帖云
齐林
刘峰辉
王磊
LIU Shuangshuang; TIE Yun; QI Lin; LIU Fenghui; WANG Lei (School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China; The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450052, China; Center for Dentistry, Henan Provincial People's Hospital, Zhengzhou 450003, China)
Source
《郑州大学学报(理学版)》
Peking University Core Journal (北大核心)
2022, No. 1, pp. 19-24 (6 pages)
Journal of Zhengzhou University: Natural Science Edition
Funding
Joint Fund of the National Natural Science Foundation of China and Henan Province (U1804152).
Keywords
16S rRNA
variable region
operational taxonomic unit
evolutionary tree
evolutionary relationship