Abstract
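Objective To perform multi-class prediction of disease progression stage in patients diagnosed with diffuse large B-cell lymphoma (DLBCL) at a hospital in Shanxi Province between 2011 and 2017, and to provide a reference for deciding whether a patient needs to be switched promptly to second-line salvage therapy, radiotherapy, or another treatment. Methods Hierarchical classification was used to decompose the three-class disease progression stage into two layers of binary classification. After variable selection was carried out separately for each layer, SMOTE oversampling was applied to address class imbalance in the data. Single-classifier models (SVM, BP neural network, and random forest), AdaBoost homogeneous ensembles, and Stacking heterogeneous ensembles were then used to build binary prediction models for each of the two layers. Finally, the best-performing model in each layer was selected and the two were combined. Results Among the two-layer classification models built on SMOTE-balanced data, the SVMboost ensemble model achieved accuracies of 0.951 and 0.972 in the two layers, respectively, the best performance in both; SVMboost was therefore selected as the base classifier for both binary layers. Conclusion This study built a hierarchical multi-class prediction model for the disease progression stage of patients with diffuse large B-cell lymphoma. The SVMboost ensemble model performed best in both layers. After the base classifiers of the two binary layers were combined, the accuracy was 0.924, higher than that of the direct multi-class models used for comparison, which provides some reference for clinicians in diagnosis and in choosing a treatment plan.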
Objective To investigate multi-class prediction of disease progression stage in patients with diffuse large B-cell lymphoma (DLBCL) diagnosed in a hospital in Shanxi Province from 2011 to 2017, and to provide a reference for whether patients should be transferred to second-line salvage treatment or radiotherapy. Methods Hierarchical classification was used to split the three stages of disease progression into two layers of binary classification. After variable screening, SMOTE was used to deal with the class imbalance in the data. SVM, BP neural network, and random forest single-classifier models, together with AdaBoost homogeneous ensemble and Stacking heterogeneous ensemble methods, were then used to build binary classification prediction models of disease progression for each of the two layers. Finally, the model with the best classification performance in each layer was chosen and the two were combined. Results The accuracy of the SVMboost ensemble model in the two layers of the classification model was 0.951 and 0.972, respectively, and its performance was the best in both layers. Therefore, SVMboost was selected as the base classifier for both binary layers. Conclusion This study constructed a hierarchical multi-class prediction model for the disease progression stage of patients with diffuse large B-cell lymphoma. The SVMboost ensemble model performed best in both layers of the classification model. The accuracy of the combined two-layer binary base classifiers was 0.924, which is higher than that of the classical direct multi-class models used for comparison. This provides some reference for clinical workers in choosing diagnosis and treatment schemes.
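The two-layer decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which stages are separated at each layer, so the label scheme below (layer 1 separates stage 0 from stages 1 and 2; layer 2 separates stage 1 from stage 2) is an assumption, and the threshold rules are toy stand-ins for the trained SVMboost models. The names `make_hierarchical_classifier`, `layer1`, and `layer2` are hypothetical.

```python
from typing import Callable, Sequence

# A binary classifier here is any function mapping a feature vector to 0 or 1.
BinaryClassifier = Callable[[Sequence[float]], int]


def make_hierarchical_classifier(
    layer1: BinaryClassifier,
    layer2: BinaryClassifier,
) -> Callable[[Sequence[float]], int]:
    """Combine two binary classifiers into one three-class classifier.

    Assumed (hypothetical) label scheme, not taken from the paper:
      layer1 returns 0 for stage 0, and 1 for "stage 1 or stage 2";
      layer2 returns 0 for stage 1, and 1 for stage 2.
    layer2 is consulted only when layer1 returns 1.
    """
    def predict(x: Sequence[float]) -> int:
        if layer1(x) == 0:
            return 0
        return 1 if layer2(x) == 0 else 2

    return predict


def accuracy(predict, samples, labels) -> float:
    """Fraction of samples whose predicted class equals the true label."""
    correct = sum(1 for x, y in zip(samples, labels) if predict(x) == y)
    return correct / len(labels)


# Toy stand-ins for the trained layer models (simple thresholds, not SVMboost).
def layer1(x: Sequence[float]) -> int:
    return int(x[0] > 1.0)


def layer2(x: Sequence[float]) -> int:
    return int(x[0] > 2.0)


classify = make_hierarchical_classifier(layer1, layer2)

# Made-up one-feature samples, used only to exercise the cascade.
samples = [[0.2], [0.9], [1.4], [1.8], [2.5], [3.1]]
labels = [0, 0, 1, 1, 2, 2]

predictions = [classify(x) for x in samples]
overall_accuracy = accuracy(classify, samples, labels)
```

In a cascade of this kind, a sample misrouted by the first layer cannot be recovered by the second, which is consistent with the combined accuracy reported in the abstract (0.924) being lower than the accuracy of either layer alone (0.951 and 0.972).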
Authors
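Huang Xueqian (黄雪倩)
Zhang Yanbo (张岩波)
Wang Lei (王蕾)
Zheng Chuchu (郑楚楚)
Yu Hongmei (余红梅)
Fan Shuanglong (范双龙)
Yang Zhenhuan (阳桢寰)
Xing Meng (邢蒙)
Zhao Zhiqiang (赵志强)
Luo Yanhong (罗艳虹)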
Huang Xueqian; Zhang Yanbo; Wang Lei (Department of Health Statistics, School of Public Health, Shanxi Medical University, 030000, Taiyuan)
Source
《中国卫生统计》
CSCD
PKU Core (Peking University core journal list)
2021, Issue 2, pp. 167-170, 176 (5 pages in total)
Chinese Journal of Health Statistics
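Funding
National Natural Science Foundation of China, Young Scientists Fund (81502897)
Shanxi Medical University Doctoral Start-up Fund (BS2017029)
National Natural Science Foundation of China, General Program (81973154)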
Keywords
弥漫大B细胞淋巴瘤
层次分类法
多分类
不平衡数据
Diffuse large B-cell lymphoma
Hierarchical classification
Multi-classification
Imbalanced data