Patient-derived tumor xenografts (PDXs) are a powerful tool for drug discovery and screening in cancer. However, genotype mismatches in PDXs remain poorly understood, leading to massive economic losses. Here, we established PDX models from 53 lung cancer patients with a genotype matching rate of 79.2% (42/53). Furthermore, 17 clinicopathological features were examined and entered into stepwise logistic regression (LR) models based on the lowest Akaike information criterion (AIC), least absolute shrinkage and selection operator (LASSO)-LR, support vector machine recursive feature elimination (SVM-RFE), extreme gradient boosting (XGBoost), gradient boosting with categorical features (CatBoost), and the synthetic minority oversampling technique (SMOTE). Finally, the performance of all models was evaluated by accuracy, area under the receiver operating characteristic curve (AUC), and F1 score in 100 testing groups. Two multivariable LR models revealed that age, number of driver gene mutations, epidermal growth factor receptor (EGFR) gene mutations, type of prior chemotherapy, prior tyrosine kinase inhibitor (TKI) therapy, and the source of the sample were powerful predictors. Moreover, CatBoost (mean accuracy = 0.960; mean AUC = 0.939; mean F1 score = 0.908) and the eight-feature SVM-RFE (mean accuracy = 0.950; mean AUC = 0.934; mean F1 score = 0.903) showed the best performance among the algorithms. Meanwhile, applying SMOTE improved the predictive capability of most models, except CatBoost. With SMOTE applied, the ensemble classifier of single models achieved the highest accuracy (mean = 0.975), AUC (mean = 0.949), and F1 score (mean = 0.938). In conclusion, we established an optimal predictive model to screen lung cancer patients for non-obese diabetic (NOD)/Shi-scid, interleukin-2 receptor (IL-2R) γ^(null) (NOG)/PDX models and offer a general approach for building predictive models.
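The class imbalance that motivates SMOTE here follows from the reported counts: 42 of 53 models matched, so 11 did not. The sketch below illustrates the core SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours) using only the Python standard library. It is not the authors' implementation. The two toy features, the choice of k = 3, and the decision to oversample to a 1:1 ratio are assumptions made for illustration.

```python
import random
from math import dist


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each placed on the segment between a
    randomly chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist(base, p),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


# Hypothetical toy data: 11 mismatched cases with two invented features
# (age in years, number of driver gene mutations). Not data from the study.
rng = random.Random(42)
mismatched = [(rng.uniform(40, 80), float(rng.randint(0, 3))) for _ in range(11)]

# Assumed target: bring the 11 mismatched cases up to the 42 matched cases.
synthetic = smote(mismatched, n_new=42 - 11)
print(len(mismatched) + len(synthetic))
```

Because each synthetic point is a convex combination of two real minority samples, it always falls inside the range already spanned by the minority class. In practice, oversampling of this kind is normally applied to the training split only, so that the testing groups contain real cases.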
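Objective: To analyze the expression and prognostic value of the NOG gene in glioma using bioinformatics, and to infer the biological function of NOG in the development and progression of glioma. Methods: mRNA expression and clinical datasets for 422 and 702 glioma cases were downloaded and screened from the CGGA and TCGA databases, respectively, and differential expression of NOG was analyzed with the nonparametric rank-sum test. The prognostic value of NOG in glioma was assessed with Kaplan-Meier survival curves and Cox analysis. In both the CGGA and TCGA databases, NOG was correlated with other genes; strongly correlated genes were selected and subjected to GO and KEGG analysis. GSVA was performed for NOG against the angiogenesis-related gene set (ARGS), and correlation analysis was performed against vasculogenic mimicry (VM)-related genes. Results: In adult glioma, lower WHO grade and younger age were associated with higher NOG expression (P<0.001). NOG was significantly more highly expressed in the IDH-mutant, 1p/19q-codeleted, and MGMT-methylated groups (P<0.001). Survival analysis showed that overall survival (OS) was significantly longer in the high-NOG-expression group than in the low-expression group (P<0.0001). Cox regression showed that NOG expression level was an independent prognostic factor for OS in the TCGA data [HR=0.395, 95% CI (0.236, 0.662)]. GO and KEGG analysis showed that NOG was enriched in nervous system development, osteoblast differentiation, angiogenesis, cell adhesion, integrins, focal adhesion, extracellular matrix structure, the TGF-β signaling pathway, and proteoglycans in cancer. GSVA indicated that NOG expression was negatively correlated with angiogenesis. Correlation analysis showed that NOG expression was negatively correlated with the expression of glioma VM-related genes. Conclusion: NOG is differentially expressed in glioma, is closely associated with molecular markers, and is an independent prognostic factor for OS; high NOG expression indicates a favorable prognosis. NOG may be closely related to nervous system development, skeletal system development, angiogenesis, cell migration, the TGF-β signaling pathway, and tumor cell migration, invasion, and metastasis. NOG may participate in the negative regulation of endothelium-dependent angiogenesis in glioma while also inhibiting the formation of vasculogenic mimicry. NOG may play an anti-tumor role in the development and progression of glioma and may serve as a prognostic indicator and a potential biomarker for targeted therapy.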
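As a quick consistency check on the reported Cox result, the hazard ratio should sit at the geometric midpoint of its confidence interval when that interval is a Wald interval built on the log scale. The sketch below assumes exactly that (a two-sided 95% Wald interval with z = 1.96), which the abstract does not state; the derived standard error and z statistic are therefore approximations, not values reported by the authors.

```python
import math

# Values reported in the abstract (TCGA cohort)
hr, ci_low, ci_high = 0.395, 0.236, 0.662

Z_95 = 1.96  # assumption: two-sided 95% Wald interval on the log-hazard scale

# Geometric midpoint of the interval; should reproduce the reported HR
hr_from_ci = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)

# Standard error of the log hazard ratio implied by the interval width
se_log_hr = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)

# Approximate Wald z statistic for the NOG coefficient
wald_z = math.log(hr) / se_log_hr

print(round(hr_from_ci, 3), round(se_log_hr, 3), round(wald_z, 2))
```

Under this assumption the midpoint agrees with the reported HR of 0.395 to three decimal places, and the interval excludes 1, in line with the abstract's description of NOG expression as an independent prognostic factor.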
Funding: Supported in part by grants from the National Natural Science Foundation of China (81802255); Clinical Research Project of Shanghai Pulmonary Hospital (FKLY20010); Young Talents in Shanghai (2019 QNBJ); "Dream Tutor" Outstanding Young Talents Program (fkyq1901); Clinical Research Project of Shanghai Pulmonary Hospital (FKLY20001); Respiratory Medicine, a key clinical specialty construction project in Shanghai, promotion and application of multidisciplinary collaboration system for pulmonary non-infectious diseases; Clinical Research Project of Shanghai Pulmonary Hospital (fk18005); Key Discipline in 2019 (Oncology); Project of Shanghai Municipal Health Commission (201940192); Scientific Research Project of Shanghai Pulmonary Hospital (fkcx1903); Shanghai Municipal Commission of Health and Family Planning (2017YQ050); Innovation Training Project of SITP of Tongji University, Key Projects of Leading Talent (19411950300); Youth project of hospital management research fund of Shanghai Hospital Association (Q1902037).