Abstract
Keloids are benign skin tumors caused by excessive proliferation of connective tissue in wounded skin. Accurately predicting keloid risk in trauma patients and diagnosing keloids early are of great importance for in-depth keloid management and for controlling disease progression. This study analyzed four keloid datasets from the Gene Expression Omnibus (GEO) database, identified diagnostic markers for keloids, and established a nomogram prediction model. First, 37 core protein-coding genes were selected through weighted gene co-expression network analysis (WGCNA), differential expression analysis, and the centrality algorithm of the protein-protein interaction network. Subsequently, two machine learning algorithms, the least absolute shrinkage and selection operator (LASSO) and support vector machine-recursive feature elimination (SVM-RFE), were used to screen out the four diagnostic markers with the highest predictive power for keloids: hepatocyte growth factor (HGF), syndecan-4 (SDC4), ectonucleotide pyrophosphatase/phosphodiesterase 2 (ENPP2), and Rho family guanosine triphosphatase 3 (RND3). The biological pathways potentially involved were explored through single-gene gene set enrichment analysis (GSEA). Finally, univariate and multivariate logistic regression analyses of the diagnostic markers were performed, and a nomogram prediction model was constructed. Internal and external validation showed that the calibration curve of the model closely approximated the ideal curve, its decision curve was superior to those of the other strategies, and its area under the receiver operating characteristic curve was higher than that of the control models (optimal cutoff value: 0.588). These results indicate that the model has high calibration, clinical benefit, and predictive power, and that it is a promising tool for early clinical diagnosis.
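The validation metrics named in the abstract, the area under the receiver operating characteristic (ROC) curve and an optimal cutoff on the predicted probability, can be illustrated with a minimal sketch. The abstract does not state how the cutoff of 0.588 was derived; the sketch below assumes the common choice of maximizing Youden's J (sensitivity + specificity - 1), and the labels and scores are invented toy values, not data from the study. The function names `auc`, `roc_points`, and `youden_cutoff` are hypothetical helpers written for this illustration.

```python
from typing import List, Tuple


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half), i.e. the Mann-Whitney formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def roc_points(labels: List[int],
               scores: List[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, false positive rate, true positive rate) for each
    distinct score, predicting 'keloid' when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        points.append((thr, fp / n_neg, tp / n_pos))
    return points


def youden_cutoff(labels: List[int], scores: List[float]) -> float:
    """Threshold maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    return max(roc_points(labels, scores), key=lambda p: p[2] - p[1])[0]


# Toy example: 1 = keloid, 0 = control; scores mimic model probabilities.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.70, 0.40, 0.60, 0.80, 0.90]

print(auc(labels, scores))            # 0.875
print(youden_cutoff(labels, scores))  # 0.4
```

On real data, the scores would be the predicted probabilities from the fitted logistic regression model, and the cutoff would be estimated on the training cohort and then checked on the external validation cohort.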
Authors
LI Zhengyu; TIAN Baohua; LIANG Haixia (College of Biomedical Engineering, Taiyuan University of Technology, Taiyuan 030024, P.R. China)
Source
Journal of Biomedical Engineering (生物医学工程学杂志)
Indexed in: EI; CAS; Peking University Core Journals (北大核心)
2023, No. 4, pp. 725-735 (11 pages)
Funding
National Natural Science Foundation of China, Young Scientists Fund (31501124).
Keywords
Keloids
Weighted gene co-expression network analysis
Least absolute shrinkage and selection operator
Support vector machine-recursive feature elimination
Nomogram prediction model