Abstract
Traditional turnover-prediction algorithms tend to suffer from low accuracy, overfitting, and poor robustness when applied to real-world data sets that are high-dimensional, small, and imbalanced. To address these problems, this paper proposes an approach to predicting nurse turnover based on CatBoost, a gradient boosting ensemble classifier. The approach handles categorical features, searches for optimal parameters with BOHB (Bayesian Optimization and Hyperband), and evaluates classification performance with cross-validation. The model is applied to a high-dimensional, small, imbalanced data set of nurse turnover from several public hospitals in Shanghai and compared with XGBoost, random forest, and support vector machine. The experimental results show that the model achieves high accuracy and strong robustness and can effectively predict nurse turnover, making it a promising alternative for turnover prediction in hospital human resource management.
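The abstract does not say how the cross-validation folds were drawn. As a minimal sketch, assuming stratified folds (a common choice for imbalanced labels, not confirmed by the source), the following pure-Python code spreads the minority class evenly across folds so that every test fold contains some turnover cases. The function names and the toy labels are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def stratified_kfold(labels, k):
    """Return k lists of indices, each keeping roughly the overall class ratio."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    # Deal each class's indices round-robin so minority samples
    # are spread across all folds instead of clustering in one.
    for y in sorted(by_class):
        for pos, idx in enumerate(by_class[y]):
            folds[pos % k].append(idx)
    return folds


def train_test_splits(labels, k):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_kfold(labels, k)
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Hypothetical toy labels: 1 = nurse left (minority), 0 = nurse stayed.
labels = [0] * 16 + [1] * 4
splits = list(train_test_splits(labels, 4))
```

With these toy labels each of the four test folds holds four "stayed" cases and one "left" case, so a per-fold metric is never computed on a fold with no positives.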
Author
孙烨珩
SUN Ye-heng (Business School, University of Shanghai for Science & Technology, Shanghai 200093, China)
Source
《科技和产业》
2020, No. 12, pp. 227-232, 246 (7 pages in total)
Science Technology and Industry