Improved Clustering Algorithm Based on Spectrum for Gut Microbiome
Abstract: The gut microbiome is associated with many major human diseases, so studying gut microbiome data under different conditions is of great significance. Because microbiome data are zero-inflated, the geometric mean of pairwise ratios (GMPR) method is used to normalize them. Taking a type 2 diabetes mellitus (T2DM) dataset as an example, this study proposes an improved Spectrum algorithm. First, a feature-weighted similarity matrix is used, so that the weight each feature value carries within its sample is not ignored. Second, the Laplacian matrix is replaced by the Hessian matrix to avoid the sensitivity problem of traditional spectral clustering, and the original K-means algorithm is replaced by the ISODATA clustering algorithm, which can effectively adjust the number of cluster centers K. The experimental results show that, on the T2DM dataset, GMPR + improved Spectrum achieves a normalized mutual information (NMI) of 0.423, a Davies-Bouldin index (DBI) of 4.751, a Calinski-Harabasz index (CH) of 25.541, a Rand index (RI) of 0.835, and an adjusted Rand index (ARI) of 0.019, an improvement over the algorithm before modification. The algorithm can also identify structural differences in the gut microbiome among different types of patients and uncover key bacteria of the gut microbiome.
Authors: REN Yuyan (任玉艳); XIONG Xin (熊馨); HE Jianfeng (贺建峰) (School of Nursing and Health, Zhejiang Changzheng Vocational and Technical College, Hangzhou 310012, China; Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
Source: Acta Laser Biology Sinica (《激光生物学报》), CAS, 2022, No. 5, pp. 440-449 (10 pages)
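Funding: National Natural Science Foundation of China, Regional Program (82060329); Yunnan Provincial Department of Science and Technology, General Program (202201AT070108).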
Keywords: gut microbiome; similarity matrix; Laplacian matrix; cluster; type 2 diabetes mellitus
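
The abstract names GMPR as the normalization step for zero-inflated count data but does not spell it out. As a rough illustration only, and not the authors' implementation, GMPR size factors can be sketched as follows: for each pair of samples, take the median count ratio over the taxa that are non-zero in both, then take the geometric mean of those medians for each sample. The function names `gmpr_size_factors` and `gmpr_normalize`, the `min_shared` parameter, and the toy counts are assumptions made for this example.

```python
import math
from statistics import median


def gmpr_size_factors(counts, min_shared=1):
    """Return one GMPR-style size factor per sample.

    counts: list of samples, each a list of non-negative taxon counts.
    min_shared: hypothetical threshold; pairs sharing fewer non-zero
    taxa than this are skipped.
    """
    n = len(counts)
    factors = []
    for i in range(n):
        log_medians = []
        # j == i is included on purpose: it contributes a ratio of 1,
        # so the geometric mean runs over all samples.
        for j in range(n):
            # Ratios only over taxa observed (non-zero) in both samples,
            # which is how the zeros are kept out of the estimate.
            ratios = [a / b for a, b in zip(counts[i], counts[j])
                      if a > 0 and b > 0]
            if len(ratios) >= min_shared:
                log_medians.append(math.log(median(ratios)))
        if log_medians:
            # Geometric mean of the pairwise median ratios.
            factors.append(math.exp(sum(log_medians) / len(log_medians)))
        else:
            # No usable comparison (e.g. an all-zero sample).
            factors.append(float("nan"))
    return factors


def gmpr_normalize(counts, min_shared=1):
    """Divide each sample's counts by its size factor."""
    factors = gmpr_size_factors(counts, min_shared)
    return [[c / s for c in row] for row, s in zip(counts, factors)]


# Toy data (made up): the second sample is sequenced twice as deeply,
# and each sample has a zero where the other does not.
toy_counts = [[2, 4, 0, 8],
              [4, 8, 6, 0]]
toy_factors = gmpr_size_factors(toy_counts)
toy_normalized = gmpr_normalize(toy_counts)
```

On the toy data the two size factors differ by a factor of two, so the shared taxa end up with equal normalized values while zeros stay zero. Published GMPR implementations may differ in details such as the minimum number of shared taxa per pair.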