Abstract
Prognostic data on breast cancer patients were collected. Logistic regression, decision tree, and random forest models were used to analyze and compare the sensitivity of different variables to breast cancer prognosis and to compare the importance of the feature variables. Multivariate logistic regression analysis showed that age, N-stage, endocrine therapy, chemotherapy regimen, multifocal lesions, and chemotherapy were associated with prognosis. The random forest analysis yielded an importance score for each factor, which helps identify high-risk groups correctly and supports building a classification system. These results were used to identify risk factors and to build a prediction model with relatively high accuracy for predicting the survival of breast cancer patients, which can serve as a reference for medical decision-making. Individualized postoperative follow-up and adjuvant treatment strategies can then be tailored to each patient's characteristics, with the aim of preventing postoperative recurrence of breast cancer as far as possible.
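The multivariate logistic regression step mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than the study's cohort, the variable names (`age`, `n_stage`, `noise`) and their coding are hypothetical, and the model is fitted with plain batch gradient descent instead of whatever statistical software the authors used. The sketch covers only logistic regression, not the decision tree or random forest analyses.

```python
import math
import random

# Hypothetical variable names; the study's actual coding is not given in the abstract.
FEATURES = ["age", "n_stage", "noise"]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_cohort(n=400, seed=0):
    """Generate synthetic patients (NOT real data); label 1 means poor prognosis."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        age = rng.gauss(0.0, 1.0)            # standardized age
        n_stage = rng.choice([0, 1, 2, 3])   # assumed ordinal N-stage coding
        noise = rng.gauss(0.0, 1.0)          # variable unrelated to the outcome
        logit = -1.5 + 0.8 * age + 0.9 * n_stage
        labels.append(1 if rng.random() < sigmoid(logit) else 0)
        rows.append([age, float(n_stage), noise])
    return rows, labels


def fit_logistic(rows, labels, lr=0.2, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, k = len(rows), len(rows[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y
            for j in range(k):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def odds_ratios(weights):
    """Convert each fitted coefficient into an odds ratio per unit increase."""
    return {name: math.exp(w) for name, w in zip(FEATURES, weights)}


if __name__ == "__main__":
    rows, labels = make_synthetic_cohort()
    weights, bias = fit_logistic(rows, labels)
    for name, ratio in odds_ratios(weights).items():
        print(f"{name}: odds ratio = {ratio:.2f}")
```

An odds ratio clearly above 1 marks a variable associated with a worse outcome in this toy cohort, while the unrelated `noise` variable should stay close to 1. A real analysis would also report confidence intervals and p-values, which this sketch omits.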
Authors
王哲
王凯
杨日东
周毅
WANG Zhe; WANG Kai; YANG Ri-dong (School of Engineering and Technology, Xinjiang Medical University, Urumqi 830000, Xinjiang Uygur Autonomous Region, P.R.C.)
Source
《中国数字医学》
2019, Issue 1, pp. 18-20, 46 (4 pages in total)
China Digital Medicine
Funding
National Natural Science Foundation of China (Grant Nos. 61876194, 11661007)
National Key Research and Development Program of China (Grant Nos. 2018YFC0116902, 2018YFC0116904, 2016YFC0901602)
NSFC-Guangdong Big Data Science Center Joint Fund (Grant No. U1611261)
Guangzhou Science and Technology Program (Grant No. 201604020016)