
Development and validation of a machine learning-based early diagnostic model for interstitial lung disease in patients with rheumatoid arthritis
Abstract
Objective To screen for factors associated with interstitial lung disease (ILD) in patients with rheumatoid arthritis (RA), and to construct and validate an early diagnostic model.
Methods A total of 712 RA patients seen in the Department of Rheumatology and Immunology of the Second Hospital of Shanxi Medical University between December 2019 and October 2022 were enrolled. Fifty-two variables were collected, covering general data, clinical symptoms, and laboratory indices. Patients were divided into an RA-only group and an RA-ILD group according to whether ILD was present. After data preprocessing, subjects were randomly assigned to a modeling group and a validation group in a 7:3 ratio. Univariate analysis was used to compare baseline characteristics between the two groups. Feature selection was performed with least absolute shrinkage and selection operator (LASSO) regression and the support vector machine-recursive feature elimination (SVM-RFE) algorithm. The selected variables were entered into a logistic regression analysis, and its results were used to build a nomogram for the early diagnosis of ILD in RA. The model was evaluated internally on the modeling group data and validated internally on the validation group data.
Results Of the 712 subjects, 498 were in the modeling group and 214 in the validation group. Univariate analysis showed statistically significant differences between the two groups (P<0.05) in 18 variables: male sex, age, smoking history, drinking history, swollen joint count, painful joint count, prednisone acetate use, white blood cell count (WBC), ESR, CRP, IL-2, IL-10, IL-17, TNF-α, IFN-γ, AFA family, antiperinuclear factor antibody (APF), and serum albumin. LASSO identified 13 risk variables for RA-ILD and SVM-RFE identified 12; the 10 variables common to both were male sex, age, drinking history, painful joint count, prednisone acetate, IL-2, AFA family, TNF-α, serum albumin, and IL-10. Logistic regression showed that male sex [OR (95%CI) = 3.61 (2.11, 6.18)], age [OR (95%CI) = 1.05 (1.03, 1.08)], painful joint count [OR (95%CI) = 1.03 (1.01, 1.06)], IL-2 [OR (95%CI) = 0.91 (0.84, 0.99)], and TNF-α [OR (95%CI) = 1.06 (1.02, 1.10)] were independent factors associated with ILD in RA (P<0.05). Calibration curves for the nomogram built from these factors showed fairly high accuracy in both the modeling and validation groups. The area under the receiver operating characteristic (ROC) curve (AUC) and decision curve analysis (DCA) indicated fairly high diagnostic ability: in the modeling group the AUC (95%CI) was 0.76 (0.71, 0.81), with net benefit rates of 3% to 82% and 93% to 99%; in the validation group the AUC (95%CI) was 0.71 (0.64, 0.79), with net benefit rates of 5% to 11%, 14% to 60%, and 85% to 89%.
Conclusion Male sex, age, painful joint count, IL-2, and TNF-α are independent factors associated with ILD in RA, and the nomogram built from them performs well in early diagnosis of the disease.
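The selection pipeline described in the Methods (7:3 split, LASSO and SVM-RFE feature selection, intersection of the two sets, logistic regression, AUC on the validation group) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, `select_and_fit` and its parameters (`lasso_C`, `n_svm_features`, `seed`) are hypothetical names, LASSO is approximated by an L1-penalized logistic regression with a fixed penalty rather than a cross-validated one, and the final scikit-learn logistic fit is lightly L2-regularized and gives no P values or confidence intervals.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def select_and_fit(X, y, n_svm_features=12, lasso_C=0.1, seed=0):
    """Hypothetical helper: 7:3 split, LASSO and SVM-RFE selection, logistic fit."""
    # Random 7:3 split into modeling and validation groups.
    X_tr, X_va, y_tr, y_va = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    # Standardize using statistics from the modeling group only.
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_va = scaler.transform(X_tr), scaler.transform(X_va)

    # LASSO-style selection: keep features with non-zero L1-penalized coefficients.
    lasso = LogisticRegression(
        penalty="l1", solver="liblinear", C=lasso_C, random_state=seed
    ).fit(X_tr, y_tr)
    lasso_idx = set(np.flatnonzero(lasso.coef_[0]).tolist())

    # SVM-RFE: recursively drop the lowest-weight feature of a linear SVM.
    rfe = RFE(SVC(kernel="linear"), n_features_to_select=n_svm_features, step=1)
    rfe.fit(X_tr, y_tr)
    svm_idx = set(np.flatnonzero(rfe.support_).tolist())

    # Only variables chosen by both algorithms go into the final model.
    common = sorted(lasso_idx & svm_idx)
    if not common:
        raise ValueError("LASSO and SVM-RFE share no features; try a larger lasso_C")

    # Default scikit-learn logistic regression (L2, C=1.0), an assumption of this sketch.
    model = LogisticRegression(max_iter=1000).fit(X_tr[:, common], y_tr)
    # Odds ratios per one standard deviation, because features were standardized.
    odds_ratios = np.exp(model.coef_[0])
    auc = roc_auc_score(y_va, model.predict_proba(X_va[:, common])[:, 1])
    return {
        "n_train": X_tr.shape[0],
        "n_valid": X_va.shape[0],
        "lasso": sorted(lasso_idx),
        "svm_rfe": sorted(svm_idx),
        "common": common,
        "odds_ratios": odds_ratios,
        "auc": auc,
    }


# Synthetic stand-in with the study's dimensions (712 x 52); NOT the study data.
X, y = make_classification(
    n_samples=712, n_features=52, n_informative=8, n_redundant=4,
    n_clusters_per_class=1, random_state=0,
)
result = select_and_fit(X, y)
print(result["n_train"], result["n_valid"], result["common"], round(result["auc"], 2))
```

On real data the odds ratios would normally be reported on the original measurement scale with confidence intervals, which requires an unpenalized fit in a statistics package; the sketch only shows how the two selected feature sets are intersected before modeling.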
Authors Nie Yancong; Jin Yanqing; Yin Meilin; Wang Xiaoxia; Qiu Lixia (Department of Public Health, Shanxi Medical University, Taiyuan 030001, China; Second Clinical School of Medicine, Shanxi Medical University, Taiyuan 030001, China; Department of Rheumatology and Immunology, Second Hospital of Shanxi Medical University, Taiyuan 030001, China)
Source Chinese Journal of Rheumatology (《中华风湿病学杂志》), 2024, No. 3, pp. 167-175, I0004 (10 pages in total). Indexed in: CAS, CSCD
Keywords Arthritis, rheumatoid; Lung diseases, interstitial; Machine learning; Diagnostic model