Abstract
Objective: To explore new molecular subtypes of breast cancer and their prognostic value based on G-quadruplex-related genes using bioinformatic methods, and to construct a G-quadruplex-related gene risk model for predicting the prognosis of breast cancer patients. Methods: Breast cancer transcriptome data and clinical information were obtained from The Cancer Genome Atlas (TCGA) database, and PubMed was systematically searched for G-quadruplex-related studies to screen and compile a set of G-quadruplex-related genes. Consensus clustering was used to identify subtypes regulated by G-quadruplex-related genes. Gene Set Enrichment Analysis (GSEA), Cox regression analysis, LASSO regression analysis, and other bioinformatic methods were then used to construct a G-quadruplex-related gene risk score model for predicting the prognosis of breast cancer patients, and the model's predictive performance was evaluated. Results: Subtypes regulated by G-quadruplex-related genes were identified, and overall survival, gene set pathway enrichment, and drug IC50 differed significantly among the subtypes. Eight G-quadruplex-related genes (RPS9, YBX1, FOS, VIM, RPL10, MYC, TAF15, and HNRNPA1) were selected to construct the prediction model for the G-quadruplex-related gene regulatory subgroups. Kaplan-Meier analysis suggested that the low-score group had worse outcomes than the high-score group (P=0.02). The area under the ROC curve (AUC) of the risk model for predicting 1-, 3-, and 5-year overall survival was 0.80, 0.77, and 0.78, respectively, in the training set, and 0.81, 0.74, and 0.75, respectively, in the validation set. Multivariate Cox regression analysis showed that the risk score was an independent predictor of prognosis in breast cancer patients (P<0.001). Conclusion: Breast cancer subtypes were identified based on G-quadruplex-related genes, and the risk score prognostic model built on these genes has independent prognostic value in breast cancer, offering new insight into prognostic prediction for the disease.
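The abstract names the eight genes in the risk model but does not report their coefficients or the cutoff used to form the two score groups. Risk scores of this kind are commonly computed as a weighted sum of gene expression values, with the weights taken from the LASSO-Cox fit. The following is a minimal sketch under that assumption: every coefficient is a hypothetical placeholder, the median split is an assumed convention rather than the authors' stated method, and the function names are illustrative only.

```python
from statistics import median

# Hypothetical placeholder weights: the abstract lists the eight genes
# but does not report the fitted LASSO-Cox coefficients.
COEFFICIENTS = {
    "RPS9": 0.10, "YBX1": 0.20, "FOS": -0.15, "VIM": 0.05,
    "RPL10": -0.10, "MYC": 0.12, "TAF15": 0.08, "HNRNPA1": -0.05,
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Weighted sum of expression values over the model genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the cohort median.

    The median cutoff is an assumption; the abstract does not state
    how the score groups were defined.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


if __name__ == "__main__":
    # Toy expression profiles, invented purely for illustration.
    cohort = {
        "patient_1": {gene: 1.0 for gene in COEFFICIENTS},
        "patient_2": {gene: 2.0 for gene in COEFFICIENTS},
        "patient_3": {gene: 3.0 for gene in COEFFICIENTS},
    }
    scores = {pid: risk_score(expr) for pid, expr in cohort.items()}
    print(split_by_median(scores))
```

In an analysis like the one described, the resulting group labels would then feed the Kaplan-Meier comparison and the time-dependent ROC evaluation.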
Authors
LI Jun (李俊); WEI Lei (魏蕾); ZHANG Jingwei (张京伟) (Dept. of Thyroid and Breast Surgery, Zhongnan Hospital of Wuhan University, Wuhan 430071, Hubei, China; Dept. of Pathophysiology, Taikang Medical College (College of Basic Medical Sciences), Wuhan University, Wuhan 430071, Hubei, China)
Source
Medical Journal of Wuhan University (《武汉大学学报(医学版)》)
CAS
2024, Issue 10, pp. 1222-1227 (6 pages)