Abstract
Research on hub genes that affect cancer traits has two shortcomings. First, gene information processing focuses only on strongly correlated genes and neglects weakly correlated genes and the co-expression between different gene modules. Second, hub genes are identified by degree centrality alone when analyzing the gene network, so the information contained in the data is not mined thoroughly. This paper proposes Gene2vec, a graph embedding algorithm based on random walks guided by gene module label information. An appropriate soft threshold is selected so that more weakly correlated gene information is retained. Gene modules of different kinds that are highly positively correlated with the trait are combined to form a gene module co-expression network. To address the problems that traditional weighted gene co-expression network analysis and graph embedding methods have in mining gene module network information, a label parameter and other parameters are used to adjust the random walk process in the gene module network, and the node sequences generated by the walks are analyzed to mine the network's information. Experiments show that Gene2vec outperforms other algorithms in hub gene detection rate, and that the hub genes it identifies have higher expression levels in cancer traits than hub genes obtained by common biological methods.
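The two mechanisms named in the abstract, soft thresholding and a label-adjusted random walk, can be illustrated with a minimal sketch. The abstract does not state Gene2vec's actual transition rule or parameters, so everything here is an assumption for illustration only: the function names, the `label_weight` parameter, and the rule of multiplying an edge weight by a factor when two genes share a module label are hypothetical. The adjacency follows the common unsigned soft-threshold form a_ij = |cor_ij|^beta; a smaller beta keeps more of a weak correlation (0.3^2 = 0.09, whereas 0.3^6 is about 0.0007).

```python
import random


def soft_threshold_adjacency(corr, beta):
    """Unsigned soft-threshold adjacency: a_ij = |cor_ij| ** beta, zero diagonal."""
    n = len(corr)
    return [
        [0.0 if i == j else abs(corr[i][j]) ** beta for j in range(n)]
        for i in range(n)
    ]


def label_biased_walk(adj, labels, start, length, label_weight, rng):
    """Random walk over a weighted gene network (hypothetical transition rule).

    Each candidate step is weighted by the edge weight, multiplied by
    `label_weight` when the candidate shares the current node's module label.
    label_weight > 1 favors staying inside a module; < 1 favors crossing.
    """
    nodes = range(len(adj))
    walk = [start]
    for _ in range(length - 1):
        cur = walk[-1]
        weights = [
            adj[cur][nxt] * (label_weight if labels[nxt] == labels[cur] else 1.0)
            for nxt in nodes
        ]
        if sum(weights) == 0:  # isolated node: stop the walk early
            break
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


if __name__ == "__main__":
    # Toy correlation matrix for four genes in two modules (made-up values).
    corr = [
        [1.0, 0.9, 0.3, 0.2],
        [0.9, 1.0, 0.4, 0.3],
        [0.3, 0.4, 1.0, 0.8],
        [0.2, 0.3, 0.8, 1.0],
    ]
    labels = [0, 0, 1, 1]  # module label per gene
    adj = soft_threshold_adjacency(corr, beta=2)
    rng = random.Random(42)
    walks = [label_biased_walk(adj, labels, s, 6, 0.5, rng) for s in range(4)]
    print(walks)
```

Node sequences produced this way could then be passed to a skip-gram style embedding model, as DeepWalk and node2vec do; whether Gene2vec follows that pipeline is not stated in the abstract.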
Authors
初妍
戚书豪
张薇
王瀚麟
李松
CHU Yan; QI Shu-hao; ZHANG Wei; WANG Han-lin; LI Song (College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001, China; School of Computer Science and Technology, Heilongjiang University, Harbin, Heilongjiang 150080, China; College of Underwater Acoustic Engineering, Harbin Engineering University, Harbin, Heilongjiang 150001, China)
Source
《电子学报》
EI
CAS
CSCD
PKU Core (北大核心)
2023, No. 10, pp. 2866-2873 (8 pages)
Acta Electronica Sinica
Funding
Basic Research Project of the Fundamental Research Funds for Heilongjiang Provincial Universities (No. 2021-KYYWF-0043).