Abstract
Objective: To identify potential prognostic markers for intrahepatic cholangiocellular carcinoma (ICC) using bioinformatics and to construct a survival prediction model to better guide clinical treatment. Methods: Differentially expressed genes were identified from the TCGA-CHOL and GSE107943 datasets, and a scale-free network was constructed with the weighted correlation network analysis (WGCNA) algorithm to find gene modules closely associated with tumourigenesis. The differentially expressed genes were intersected with the module genes, and a prognostic risk model for ICC was built using univariate Cox and Lasso-Cox regression; the E-MTAB-6389 dataset was used for external validation. Protein expression of the key genes in the model was verified in 13 paired ICC tumour samples. In addition, clinicopathological data were collected from 80 patients with ICC who underwent surgery at our institution between June 2017 and June 2021, and survival data were obtained by telephone follow-up. Protein expression of the independent prognostic risk gene was scored by semi-quantitative immunohistochemistry, patients were dichotomized into high- and low-expression groups, overall survival (OS) was compared between the groups, and the relationship between the independent prognostic risk gene and clinicopathological characteristics was analysed. Results: Intersecting the differentially expressed genes with the genes in the blue WGCNA module yielded 958 genes. Univariate Cox and Lasso-Cox regression analysis identified five key genes for model construction (CFH, EGR4, RERG, PRICKLE1, NIPA1). OS was significantly lower in high-risk patients than in low-risk patients. In the evaluation of model performance, the areas under the ROC curves at 1, 3, and 5 years were 0.858, 0.881, and 0.975, respectively. Calibration curves indicated that the nomogram was highly accurate, and validation in the external cohort E-MTAB-6389 likewise indicated good model accuracy. CFH protein expression was associated with distant metastasis, lymph node metastasis, and TNM stage, and CFH can serve as an independent prognostic risk gene for ICC. Conclusion: The five-gene prognostic risk model constructed in this study (CFH, EGR4, RERG, PRICKLE1, NIPA1) has good predictive performance and can serve as a reference for prognostic assessment of patients with ICC.
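A Lasso-Cox risk model of this kind is typically applied by computing, for each patient, a weighted sum of the model genes' expression values and then splitting the cohort into high- and low-risk groups. The sketch below illustrates that idea only. The coefficients (including their signs), the expression values, and the median cutoff are all hypothetical placeholders; the abstract reports neither the fitted coefficients nor the cutoff rule.

```python
from statistics import median

# Hypothetical coefficients for illustration only. These are NOT the values
# fitted in the study, and their signs are arbitrary.
COEFFICIENTS = {
    "CFH": 0.30,
    "EGR4": 0.20,
    "RERG": -0.25,
    "PRICKLE1": -0.15,
    "NIPA1": 0.10,
}


def risk_score(expression):
    """Return the linear predictor: sum of coefficient * expression per gene."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(patients):
    """Label each patient 'high' or 'low' risk relative to the cohort median.

    The median cutoff is an assumption; patients scoring exactly at the
    median are labelled 'low'.
    """
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression values for three fictional patients.
patients = {
    "P1": {"CFH": 2.0, "EGR4": 1.0, "RERG": 0.5, "PRICKLE1": 0.5, "NIPA1": 1.0},
    "P2": {"CFH": 0.5, "EGR4": 0.5, "RERG": 2.0, "PRICKLE1": 2.0, "NIPA1": 0.5},
    "P3": {"CFH": 1.0, "EGR4": 1.0, "RERG": 1.0, "PRICKLE1": 1.0, "NIPA1": 1.0},
}

groups = stratify(patients)
for pid in sorted(groups):
    print(pid, round(risk_score(patients[pid]), 3), groups[pid])
```

In a real analysis the coefficients would come from the fitted Lasso-Cox model, and the resulting groups would then be compared with Kaplan-Meier curves and time-dependent ROC analysis, as the abstract describes.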
Authors
LI Junen
HU Run
YAO Pei
GUI Renjie
DUAN Huaxin
Department of Oncology, The First Affiliated Hospital of Hunan Normal University, Changsha 410000, China
Source
Medical Journal of West China (《西部医学》)
2024, Issue 8, pp. 1202-1212 (11 pages)
Funding
Natural Science Foundation of Hunan Province (2020JJ8084).
Keywords
Intrahepatic cholangiocellular carcinoma
Bioinformatics
Prognostic model
Key genes