Abstract
Objective To explore the application and advantages of conditional inference forests in survival analysis. Methods A simulation study and a real-data example were used to compare the predictive performance of four methods: the Cox proportional hazards model, the accelerated failure time model, random survival forests, and conditional inference forests. Performance was evaluated with the Brier score. Results In the simulation study, the two forest models gave more accurate and more stable predictions than the two regression models. Conditional inference forests outperformed the other three models when the data contained polytomous covariates, collinearity, or interactions, and this advantage was more apparent with large samples and high censoring rates. In the real-data example, conditional inference forests had the best predictive performance of the four models. Conclusion Conditional inference forests can be used for survival analysis and, when polytomous covariates, collinearity, or interactions are present, offer higher accuracy and stability than other common survival analysis methods.
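The evaluation criterion named in the abstract, the Brier score, can be illustrated with a short sketch. This is not the authors' code: the abstract does not state which software or which variant of the score was used. The sketch assumes the common inverse-probability-of-censoring-weighted form at a single time point t, with the censoring distribution estimated by Kaplan-Meier and ties handled naively. The function names `km_censoring_survival` and `brier_score` are hypothetical.

```python
import numpy as np


def km_censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival function G.

    events: 1 = event observed, 0 = censored. Censoring is treated as the
    "event" here. Returns a function G(t, left=False); left=True gives G(t-).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    uniq = np.unique(times)  # sorted distinct observed times
    surv = []
    s = 1.0
    for u in uniq:
        at_risk = np.sum(times >= u)
        censored = np.sum((times == u) & (events == 0))
        s *= 1.0 - censored / at_risk
        surv.append(s)
    surv = np.array(surv)

    def G(t, left=False):
        # Count steps at times <= t (or < t when left=True).
        k = np.searchsorted(uniq, t, side="left" if left else "right")
        return 1.0 if k == 0 else float(surv[k - 1])

    return G


def brier_score(times, events, surv_prob_at_t, t):
    """Censoring-weighted Brier score at time t (lower is better).

    surv_prob_at_t[i] is a model's predicted probability that subject i
    survives beyond t.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    probs = np.asarray(surv_prob_at_t, dtype=float)
    G = km_censoring_survival(times, events)
    total = 0.0
    for T, d, p in zip(times, events, probs):
        if T <= t and d == 1:
            # Event by time t: true survival status is 0.
            total += p ** 2 / G(T, left=True)
        elif T > t:
            # Still at risk after t: true survival status is 1.
            total += (1.0 - p) ** 2 / G(t)
        # Subjects censored before t contribute zero weight.
    return total / len(times)


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    obs_times = [1.0, 2.0, 3.0, 4.0]
    obs_events = [1, 0, 1, 1]
    predicted = [0.2, 0.5, 0.7, 0.9]
    print(brier_score(obs_times, obs_events, predicted, t=2.5))
```

In a comparison such as the one described, the same score would be computed for each model's predicted survival probabilities on held-out data, and the model with the lowest value would be preferred.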
Authors
LIU Yingxin (刘颖欣)
KANG Pei (康佩)
XU Jun (许军)
AN Shengli (安胜利)
Department of Biostatistics, School of Public Health, Southern Medical University, Guangzhou 510515, China; Department of Economic Management, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China
Source
Journal of Southern Medical University (《南方医科大学学报》)
CAS
CSCD
PKU Core (北大核心)
2020, Issue 4, pp. 475-482 (8 pages)
Funding
National Natural Science Foundation of China (71673126)
Southern Medical University Scientific Research Enlightenment Project (B219339036)
Keywords
conditional inference forests
random survival forests
proportional hazards models
accelerated failure time models
survival analysis