Abstract
Objective: To use machine learning to characterize the expression patterns of disulfidptosis-related genes in the pathogenesis of Alzheimer's disease (AD), and to establish and validate diagnostic biomarkers.
Methods: The GSE33000 dataset was downloaded from the GEO database as the training dataset, and disulfidptosis-related genes were extracted for analysis. Immune infiltration and GSVA enrichment analyses were used to compare, between AD patients and healthy controls, the expression of differentially expressed genes across immune cell types and their biological functions. AD patients were divided into two subgroups by consensus clustering. Weighted gene co-expression network analysis (WGCNA) was performed for AD patients versus healthy controls and for the two AD subgroups, and the genes shared by both results were taken as AD signature genes. Random forest (RF), support vector machine (SVM), eXtreme gradient boosting (XGB), and generalized linear model (GLM) algorithms were used to build the training models. The five most relevant genes were selected as diagnostic biomarkers and validated in the GSE122063 dataset.
Results: Of the 24 disulfidptosis-related genes reported in the literature, 22 were significantly differentially expressed in AD. Immune infiltration analysis suggested that plasma cells, CD8+ T cells, and monocytes may play important roles in the regulation of AD by disulfidptosis. GSVA enrichment analysis indicated that, compared with the C1 subgroup, disulfidptosis-related differentially expressed genes in the C2 subgroup were upregulated in Huntington's disease, Parkinson's disease, and Alzheimer's disease. Consensus clustering divided the AD patients into two subgroups (C1 and C2); WGCNA identified the significant modules, and intersecting the results yielded 63 AD signature genes. Among the trained models, the SVM model had the lowest residuals and the highest area under the ROC curve (AUC, 0.946). The top five AD signature genes selected by the SVM model were PARP10, MAP2K1, PTBP1, PAK1, and NMS, and these were used to construct a nomogram for AD diagnostic risk assessment. Decision curve and calibration curve analyses showed that the model had good predictive accuracy. In external validation on the GSE122063 dataset, the model achieved an AUC of 0.788, indicating that it was successfully constructed.
Conclusion: Disulfidptosis plays an important role in the occurrence and diagnosis of AD. In the future, disulfidptosis-related genes may be used to predict and screen drugs with therapeutic potential for AD.
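The abstract compares models by the area under the ROC curve (AUC). As a minimal illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study; the authors' actual software is not described in this record.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels (1 = case, 0 = control).
    scores: iterable of model scores, higher meaning "more likely a case".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from GSE33000 or GSE122063).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

On this toy input, 8 of the 9 positive-negative pairs are ranked correctly, so the AUC is 8/9. The pairwise form is quadratic in sample size and is meant only to make the definition concrete; a rank-based implementation would be preferable for large datasets.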
Authors
HOU Chuandong; HE Peifeng; GENG Jie; CHEN Haoran; HE Tiantian; ZHANG Hui; LI Hongyi; ZHANG Haojun; ZHANG Lizhong; ZHAO Peng; ZHANG Hong; GAO Chumeng; LU Xuechun (Department of Hematology, Second Medical Center of Chinese PLA General Hospital, National Clinical Research Center for Geriatric Diseases, Beijing 100853, China; School of Management, Shanxi Medical University; School of Basic Medical Sciences, Shanxi Medical University; Medical Research Institute, Shanxi Medical University; Department of Respiratory and Critical Care Medicine, Second Medical Center of Chinese PLA General Hospital; Fuxing Road Outpatient Department, Jingnan Medical District, Chinese PLA General Hospital)
Source
《山西医科大学学报》
CAS
2024, Issue 8, pp. 1032-1041 (10 pages)
Journal of Shanxi Medical University
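Funding
Military Logistics Research Project, Special Health Care Program (23BJZ25)
Multicenter RCT Clinical Research Project of the National Clinical Research Center for Geriatric Diseases (NCRCG-PLAGH-20230010).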