Abstract
Objective: High-throughput omics data are large and information-rich, and the relationships among variables are complex. Correlation analysis helps identify valid correlated pairs in such data and is a common tool in translational medicine and systems biology. Because the microbiome (metagenomics) and metabolomics platforms both support holistic, system-level analysis, they are widely used to study associations between microbiota and metabolites. The two kinds of data differ in source, structure, and characteristics, so the correlation method must be chosen carefully for high-quality cross-omics research.
Methods: Four typical correlation analysis methods were selected (two classic methods, Pearson and Spearman, and two methods designed specifically for compositional metagenomic data, SparCC and CCLasso), and their performance was tested and compared on simulated and experimental datasets.
Results: On both simulated and real data, CCLasso yielded the smallest correlation coefficients, the largest percentage error, and the fewest correlated pairs. SparCC showed the opposite pattern. Pearson and Spearman fell between the two.
Conclusion: For correlation analysis between microbiome and metabolomics data, CCLasso is relatively stringent and prone to false-negative results, whereas SparCC is relatively permissive and prone to false-positive results. Pearson and Spearman lie in between. Researchers should choose a method according to the aims and emphasis of their study.
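As a rough illustration of the two classic methods named in the abstract, the sketch below computes Pearson and Spearman correlations between simulated microbial relative abundances and one metabolite. The data-generating model, the variable names, and the helper `correlate` are assumptions made for this example and are not taken from the paper; SparCC and CCLasso are not shown.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)
n_samples, n_taxa = 200, 5

# Hypothetical simulation: one latent factor drives taxon 0 and the metabolite.
latent = rng.normal(size=n_samples)
counts = rng.lognormal(mean=0.0, sigma=1.0, size=(n_samples, n_taxa))
counts[:, 0] *= np.exp(latent)
metabolite = 2.0 * latent + rng.normal(scale=0.5, size=n_samples)

# Closure: convert counts to relative abundances (each row sums to 1),
# which is what makes microbiome data compositional.
rel_abund = counts / counts.sum(axis=1, keepdims=True)


def correlate(x, y):
    """Return (coefficient, p-value) for Pearson and Spearman correlation."""
    r_p, p_p = pearsonr(x, y)
    r_s, p_s = spearmanr(x, y)
    return {
        "pearson": (float(r_p), float(p_p)),
        "spearman": (float(r_s), float(p_s)),
    }


# One result per taxon, each correlated against the same metabolite.
results = [correlate(rel_abund[:, j], metabolite) for j in range(n_taxa)]

for j, res in enumerate(results):
    print(f"taxon {j}: pearson r={res['pearson'][0]:+.2f}, "
          f"spearman rho={res['spearman'][0]:+.2f}")
```

In this setup only taxon 0 is constructed to be associated with the metabolite. Any non-zero coefficients for the other taxa would arise from the closure step, which is the kind of compositional effect that motivates dedicated methods.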
Authors
YOU Yijun, LIANG Dandan, CHEN Tianlu (Center for Translational Medicine, Shanghai Sixth People's Hospital, Shanghai Jiao Tong University, Shanghai 200233, China)
Source
Translational Medicine Journal (《转化医学杂志》), 2018, Issue 2, pp. 93-96 (4 pages)
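Funding
National Natural Science Foundation of China (31501079, 31500954, 81772530)
Shanghai Jiao Tong University Affiliated Sixth People's Hospital intramural pre-research program (2017)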
Keywords
Correlation analysis
Metabolomics
Microbiome
Translational medicine