Abstract
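Studying the pathogenesis of cancer by detecting mutated driver pathways is one of the key problems in current basic cancer research. Taking somatic mutation data provided by the Human Genome Project as its subject and drawing on the mutual exclusivity widely observed in cancer genomic profiles, this study proposes a novel algorithm (Megnet) for detecting mutated driver pathways based on mutually exclusive gene networks. The algorithm first uses somatic mutation data from a large number of cancer patients, together with the principle of mutual exclusivity between genes, to construct a mutated gene network, and then detects the largest complete subgraphs with high coverage in that network. To verify the efficiency and robustness of the algorithm, we apply it to simulated data; the results show that driver pathway detection finishes within 15 seconds in all simulations, and that Megnet runs faster and is more accurate than Dendrix and Multi-Dendrix. To verify its effectiveness, we also apply it to somatic mutation data from lung cancer and glioma; the results show that Megnet not only detects gene sets with higher biological relevance and statistical significance than those found by Dendrix and Multi-Dendrix, but also identifies new candidate gene sets for biological validation, and these gene sets overlap substantially with the known P53, RB, RAS and PI3K signaling pathways and with the cell cycle and apoptosis pathways. Megnet requires neither the number of genes in a pathway to be specified nor any prior knowledge, offering a new perspective for research on cancer pathogenesis. By constructing a mutated gene network, the algorithm simplifies the relationships between genes, reduces algorithmic complexity, and improves the efficiency and accuracy of driver pathway detection, giving it considerable theoretical and practical value for research on cancer pathogenesis.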
Recent genome sequencing studies have shown that the somatic mutations driving cancer development are spread across a large number of genes. One of the key issues in current basic cancer research is to study the pathogenesis of cancer by detecting mutated driver pathways. This study exploits the widespread mutual exclusivity observed in cancer genomic profiles, examines somatic mutation data provided by the Human Genome Project, and proposes a novel algorithm (Megnet) for detecting mutated driver pathways on the basis of mutually exclusive gene networks. Megnet first constructs a mutated gene network from the mutual exclusivity between each pair of genes, using somatic mutation data from many cancer patients, and then detects the largest complete subgraphs with high coverage. The genes in a largest complete subgraph are recurrently altered across a majority of tumor samples; they are known or likely to take part in the same biological process; and mutations of genes within one largest complete subgraph are mutually exclusive. To evaluate the efficiency and robustness of Megnet, we apply it to simulated data with 300 samples and 1000 genes in which 15 mutually exclusive gene sets with different degrees of coverage are embedded. The results indicate that Megnet finishes detecting driver pathways within 15 seconds in all simulations and has a shorter runtime and higher accuracy than Dendrix and Multi-Dendrix. To further verify the efficacy of the algorithm, we apply it to somatic mutation profiles of lung carcinoma and glioblastoma tumor samples from The Cancer Genome Atlas (TCGA). The results show that Megnet not only detects gene sets that are more biologically relevant and more statistically significant than those found by Dendrix and Multi-Dendrix, but also identifies new candidate gene sets for biological verification, which overlap to a high degree with known signaling pathways (such as P53, RB, RAS and PI3K) and with the cell cycle and apoptosis pathways. Since somatic mutations are hypothesized to target a small number of cellular signaling and regulatory pathways, a common approach is to assess whether known pathways are enriched for the mutated gene sets. In glioblastoma multiforme, we make the novel observation that TP53 alteration and copy-number amplification of MDM2 and MDM4 are mutually exclusive, and that RB1 deletion and loss of CDK4 and CDKN2B are also mutually exclusive, suggesting distinct alternative causes of genomic instability in this cancer type. Overall, we develop a simple, fast and sensitive method for automatically detecting driver pathways in tumors based on mutually exclusive mutational patterns. Megnet requires neither the number of genes in a driver pathway to be specified nor any prior knowledge, thus providing insights for research on the pathogenesis of cancer. By constructing mutated gene networks, the algorithm simplifies the relationships between genes, reduces algorithmic complexity, and improves the efficiency and accuracy of detecting mutated driver pathways; it therefore has high theoretical and practical value for research on the pathogenesis of cancer.
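The two steps named in the abstract (build a network from pairwise mutual exclusivity, then look for complete subgraphs with high coverage) can be illustrated with a small sketch. This is not the published Megnet implementation. The exclusivity rule (two genes are linked when they share at most `max_overlap` mutated samples), the coverage score, the thresholds, and all function names are assumptions made for illustration, because the abstract does not specify them. Complete subgraphs are enumerated here with a plain Bron-Kerbosch search.

```python
from itertools import combinations


def build_exclusivity_graph(mutations, max_overlap=0):
    """Link two genes when their mutated-sample sets share at most
    `max_overlap` samples (assumed exclusivity rule, not the paper's)."""
    graph = {gene: set() for gene in mutations}
    for a, b in combinations(mutations, 2):
        if len(mutations[a] & mutations[b]) <= max_overlap:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Yield every maximal complete subgraph (Bron-Kerbosch, no pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(graph), set())


def coverage(clique, mutations, n_samples):
    """Fraction of samples carrying a mutation in at least one clique gene."""
    covered = set().union(*(mutations[g] for g in clique))
    return len(covered) / n_samples


def detect_gene_sets(mutations, n_samples, min_coverage=0.5, min_size=2):
    """Return (coverage, genes) for cliques passing the assumed thresholds,
    highest coverage first."""
    graph = build_exclusivity_graph(mutations)
    hits = [
        (coverage(c, mutations, n_samples), sorted(c))
        for c in maximal_cliques(graph)
        if len(c) >= min_size
    ]
    return sorted((h for h in hits if h[0] >= min_coverage), reverse=True)


# Toy data (hypothetical): gene -> set of sample ids in which it is mutated.
toy = {
    "A": {1, 2, 3},
    "B": {4, 5},
    "C": {6, 7, 8},
    "D": {1, 4, 6},  # co-occurs with A, B and C, so it gets no edges
}
print(detect_gene_sets(toy, n_samples=10))
```

In the toy data, genes A, B and C never co-occur and together cover 8 of the 10 samples, so they form the only reported gene set. Gene D overlaps with each of them and is left out. No gene-set size is supplied in advance, which mirrors the abstract's point that the number of genes in a pathway need not be specified.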
Author
吴昊
WU Hao (College of Information Engineering, Northwest A&F University, Yangling, Shaanxi 712100)
Source
《计算机学报》
EI
CSCD
PKU Core Journals
2018, No. 6, pp. 1400-1414 (15 pages)
Chinese Journal of Computers
Funding
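Fundamental Research Funds for the Central Universities (2452017342)
Doctoral Scientific Research Start-up Fund (2452017019)
General Program of the Natural Science Foundation of Shaanxi Province (2017JM6063)
Key Program of the National Natural Science Foundation of China (61532014, 61432010)
Science and Technology Program of Yangling District, Shaanxi Province (2017GY-03)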