Abstract
Objective: To screen for the core driver genes in the occurrence and development of osteosarcoma (OS), to explore the pathogenic mechanism of OS at the molecular level, and to construct a gene model for predicting patient survival. Methods: The microarray matrix data of the OS datasets GSE12865, GSE14359, and GSE36001 were downloaded from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) between OS and normal tissue were screened by bioinformatic methods. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to characterize the molecular functions and pathways in which the DEGs were enriched. A protein-protein interaction (PPI) network was constructed with the STRING database, and Cytoscape software was used to analyze the correlations among the DEGs, identify the gene set most closely related to OS progression, and determine the core pathogenic genes of OS. Clinical records and transcriptome data for 379 OS samples were downloaded from The Cancer Genome Atlas (TCGA) database, and Kaplan-Meier (K-M) survival analysis was performed to further clarify and validate the relationship between the core genes and the prognosis of OS patients; other prognosis-related factors, such as gender and race, were also examined. The expression levels of a 6-gene signature were modeled to predict the survival time of OS patients. Results: The top ten DEGs ranked by the MCC algorithm were TYROBP, LAPTM5, FCER1G, CD74, HCLS1, ARHGDIB, HLA-DPA1, CD93, GIMAP4, and LYZ, and their expression levels differed significantly between OS and normal samples (P<0.05). GO and KEGG analyses showed that the DEGs were significantly enriched in the PI3K-AKT and Notch signaling pathways. K-M survival analysis showed that OS patients with lower expression of 6 genes (ARHGDIB, CD74, FCER1G, HCLS1, HLA-DPA1, and TYROBP) had longer overall survival than those with higher expression (P<0.05). The prediction model built on the gene set composed of these 6 genes had a C-index of 0.71. Conclusion: High expression of the screened core driver genes is associated with the occurrence and development of OS. The aberrant signaling pathways in the occurrence and development of OS are the PI3K-AKT and Notch signaling pathways. The prediction model built on the OS signature composed of the 6 core driver genes has good predictive ability.
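For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is an illustration only, not the authors' implementation: the abstract does not state which software or C-index variant was used. The function name `harrell_c_index`, the toy data, and the choice to skip pairs with tied follow-up times are all assumptions made here.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index (illustrative sketch, hypothetical helper).

    times       : observed follow-up times
    events      : 1 if death was observed, 0 if the patient was censored
    risk_scores : model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification (assumption): skip tied times. A pair is comparable
        # only when the shorter time corresponds to an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk died earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration; not from the study).
toy_times = [5, 10, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.7, 0.5, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the reported 0.71 should be read.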
Authors
李苇航
丁子毅
王栋
潘益凯
刘玉辉
张世磊
李靖
闫铭
LI Weihang; DING Ziyi; WANG Dong; PAN Yikai; LIU Yuhui; ZHANG Shilei; LI Jing; YAN Ming (Department of Orthopaedics, Xijing Hospital, Air Force Medical University, Xi’an 710032, China; Department of Aerospace Medical Training, School of Aerospace Medicine, Air Force Medical University, Xi’an 710032, China; School of Aerospace Medicine, Center of Clinical Aerospace Medicine, Key Laboratory of Aerospace Medicine of Ministry of Education, Air Force Medical University, Xi’an 710032, China)
Source
《吉林大学学报(医学版)》
CAS
CSCD
PKU Core (北大核心)
2021, Issue 6, pp. 1570-1580 (11 pages)
Journal of Jilin University:Medicine Edition
Funding
General Program of the National Natural Science Foundation of China (82072475).