Abstract
Gene functional enrichment analysis has become a routine step in high-throughput omics data analysis and plays an important role in revealing molecular mechanisms in biomedicine. Hundreds of methods and tools for gene functional enrichment analysis have been developed. According to the problems they address and the principles of their algorithms, these methods can be broadly classified into four categories: over-representation analysis, functional class scoring, pathway topology-based methods, and network topology-based methods. In this article, we review the principles of these four categories and representative methods in each. We also discuss the redundancy in the results of gene functional enrichment analysis and the need to establish benchmark datasets.
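To make the first of the four categories concrete, the following is a minimal sketch of over-representation analysis, assuming the commonly used one-sided hypergeometric test. The abstract does not specify a particular statistic, so the function name `ora_p_value`, its parameters, and the example numbers are all illustrative assumptions, not the reviewed tools' implementation.

```python
from math import comb


def ora_p_value(N: int, K: int, n: int, k: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    Hypothetical parameter names, chosen for this sketch:
      N: number of genes in the background (universe)
      K: number of background genes annotated to the functional set
      n: number of genes in the list of interest (e.g. differentially expressed)
      k: number of genes in the list that are annotated to the functional set
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(K, n)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)  # number of ways to draw the gene list from the background
    upper = min(K, n)   # largest possible overlap
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Made-up numbers: 20,000 background genes, 100 in the set,
    # 200 genes of interest, 8 of which fall in the set.
    print(ora_p_value(N=20000, K=100, n=200, k=8))
```

A small p-value suggests the functional set is over-represented in the gene list. In practice one such test is run per functional set, so a multiple-testing correction would normally follow; that step is omitted here.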
Source
Scientia Sinica (Vitae) (《中国科学:生命科学》)
CSCD
PKU Core Journal (北大核心)
2016, Issue 4, pp. 363-373 (11 pages)
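Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 91231116, 31071113, 30971643) and the National Basic Research Program of China (Grant Nos. 2012CB316505, 2010CB529505)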
Keywords
omics data
functional enrichment
redundancy
benchmark datasets