Uncovering Lung Cancer Risk Pathogenic Genes With Expanded Initial Node and Weighted Fusion Strategy
Cited by: 3
Abstract: Predicting risk pathogenic genes for lung cancer helps clarify disease pathogenesis and improve clinical treatment. Existing prediction methods built on the random walk with restart (RWR) framework commonly share three problems: too few initial nodes, identical node transition probabilities, and a single information source. To improve on this framework, we propose a risk pathogenic gene prediction algorithm, named AFMFSC, based on expanded initial nodes and a weighted fusion strategy, and validate it on lung cancer. First, following the idea of augmented fuzzy measure similarity, AFMFSC computes augmented functional similarity scores between genes associated with similar disease phenotypes and selects the important genes, together with the known pathogenic genes, as expanded initial nodes. Second, it runs RWR on the global protein-protein interaction (PPI) network twice, once guided by a node similarity transition matrix built from PPI network topological similarity and once guided by a correlational transition matrix built from gene expression differences, and ranks all genes in the network by a weighted fusion of the two results. Finally, enrichment analysis of the top-ranked genes determines the significant risk pathogenic genes. AFMFSC predicts 73 risk pathogenic genes for lung cancer, all closely linked to the occurrence and development of the disease.
Compared with existing methods for prioritizing candidate disease genes, AFMFSC achieves larger Top 1%, Top 5%, and AUC values, a smaller average rank, and less sensitivity to degree distribution bias. In addition, the fusion strategy ranks better than a walk guided by either transition matrix alone or by the ordinary adjacency matrix. AFMFSC not only predicts lung cancer risk pathogenic genes accurately and effectively but can also be extended to genes related to other diseases, providing new insights into the pathogenesis of cancer.
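The second step of the abstract, two restart random walks whose scores are fused by weight, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the column normalization, the restart probability of 0.7, the fusion weight `alpha`, and the toy edge weights are all assumptions chosen for illustration, and the abstract does not state how either transition matrix is actually constructed.

```python
import numpy as np


def column_normalize(weights):
    """Scale each column to sum to 1 so the matrix acts as a transition matrix."""
    weights = np.asarray(weights, dtype=float)
    col_sums = weights.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0  # isolated nodes keep an all-zero column
    return weights / col_sums


def rwr(transition, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change drops below tol."""
    p0 = np.zeros(transition.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)  # uniform mass on the initial nodes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def fused_ranking(topo_weights, expr_weights, seeds, alpha=0.5, restart=0.7):
    """Run one walk per weight matrix, then rank nodes by the weighted sum."""
    p_topo = rwr(column_normalize(topo_weights), seeds, restart)
    p_expr = rwr(column_normalize(expr_weights), seeds, restart)
    fused = alpha * p_topo + (1.0 - alpha) * p_expr
    order = np.argsort(-fused)  # node indices, highest fused score first
    return fused, order


# Toy 5-node network with made-up edge weights (hypothetical data).
topo = np.array([
    [0.0, 0.9, 0.2, 0.0, 0.0],
    [0.9, 0.0, 0.5, 0.1, 0.0],
    [0.2, 0.5, 0.0, 0.7, 0.3],
    [0.0, 0.1, 0.7, 0.0, 0.8],
    [0.0, 0.0, 0.3, 0.8, 0.0],
])
expr = np.array([
    [0.0, 0.4, 0.6, 0.0, 0.0],
    [0.4, 0.0, 0.3, 0.9, 0.0],
    [0.6, 0.3, 0.0, 0.2, 0.5],
    [0.0, 0.9, 0.2, 0.0, 0.1],
    [0.0, 0.0, 0.5, 0.1, 0.0],
])

fused, order = fused_ranking(topo, expr, seeds=[0, 1], alpha=0.5)
print("fused scores:", np.round(fused, 4))
print("ranking:", order.tolist())
```

Expanding the initial nodes corresponds here to passing a longer `seeds` list. With a high restart probability the seeds themselves dominate the ranking, so in practice the candidates of interest are the highest-ranked non-seed nodes.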
Source: Progress In Biochemistry and Biophysics (《生物化学与生物物理进展》), indexed in SCIE, CAS, CSCD, and Peking University Core Journals; 2016, No. 2, pp. 176-186 (11 pages)
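Funding: National Natural Science Foundation of China (91430111, 61473232, 61170134); National Natural Science Foundation of China Youth Fund (61502396); Collaborative Innovation Center for Internet Finance Innovation and Regulation of Sichuan Province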
Keywords: risk pathogenic gene; expanded initial node; topological similarity transition matrix; gene expression difference correlational transition matrix; random walk with restart