Abstract
Objective To identify diagnostic biomarkers for Parkinson disease (PD) and examine their correlation with immune infiltration, using bioinformatics and machine learning algorithms.
Methods The GSE20164, GSE20314, GSE20333, and GSE24378 datasets from the Gene Expression Omnibus (GEO) were analyzed to screen for genes differentially expressed in the substantia nigra between PD patients and healthy controls. Gene Ontology (GO) enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, LASSO logistic regression, and random forest were used to screen hub genes, and the area under the receiver operating characteristic (ROC) curve (AUC) of the hub genes for diagnosing PD was calculated. CIBERSORTx (cell-type identification by estimating relative subsets of RNA transcripts) was used to evaluate the infiltration of 22 immune cell types in PD patients.
Results A total of 20 PD-related differentially expressed genes were identified, 5 upregulated and 15 downregulated. GO and KEGG enrichment analyses showed that these 20 genes were involved in dopamine biosynthesis, amine biosynthesis, response to toxic substances, tyrosine metabolism, dopaminergic synapse, PD, and the synaptic vesicle cycle, among others. LASSO logistic regression and random forest identified three diagnostic hub genes: KCNMB3, SDC1, and EPYC. ROC curve analysis showed that the three hub genes combined diagnosed PD with an AUC of 0.783. Immune infiltration analysis showed that the proportions of naive B cells and monocytes were higher in the PD group than in the healthy control group (P<0.05), and that naive NK cells were positively correlated with activated CD4+ T cells (P<0.05).
Conclusions The hub genes KCNMB3, SDC1, and EPYC, screened by the LASSO and random forest algorithms, show good performance in the diagnosis of PD.
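The AUC reported in the abstract can be read as the probability that a randomly chosen PD sample receives a higher diagnostic score than a randomly chosen control. The sketch below is a minimal illustration of that definition, not the authors' code: the function name `roc_auc` and the toy labels and scores are assumptions made up for this example and are not taken from the study's data.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    in which the positive sample has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = PD, 0 = healthy control.
# The scores stand in for a combined three-gene model output (invented values).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # prints AUC = 0.778
```

In this toy case, 7 of the 9 PD-control pairs are ranked correctly, so the AUC is 7/9. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.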
Authors
王子豪
夏欢
冯婷婷
张明洋
杨新玲
Wang Zihao; Xia Huan; Feng Tingting; Zhang Mingyang; Yang Xinling (Key Laboratory, the Second Affiliated Hospital of Xinjiang Medical University, Urumqi 830000, China; Department of Neurology, the Second Affiliated Hospital of Xinjiang Medical University, Urumqi 830000, China)
Source
《神经疾病与精神卫生》
2023, Issue 12, pp. 837-847 (11 pages)
Journal of Neuroscience and Mental Health
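Funding
National Natural Science Foundation of China (81960243)
Central Government Guidance Fund for Local Science and Technology Development (ZYD2022C17)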
Keywords
Parkinson disease
Computational biology (bioinformatics)
Substantia nigra
Machine learning
Immune infiltration