
Prediction of Telecom Customer Churn Based on XGB-BFS Feature Selection Algorithm

Cited by: 6
Abstract: Customer churn is one of the most difficult problems faced by modern enterprises, and predicting churn is one of the most effective strategies the telecom industry has for retaining existing customers. Telecom customer data sets are often high-dimensional; selecting the important features and reducing the number of irrelevant attributes can improve a model's classification performance. To address the high dimensionality of customer churn data sets, a hybrid XGB-BFS feature selection method is proposed. First, features are ranked by importance using the XGBoost Fscore value, which measures the relationship between each feature and the target variable. Then sequential backward search removes the least important features one at a time, and the AUC on the validation set decides whether each feature is retained. Finally, the selected feature subset is used to build an XGBoost customer churn prediction model. Experiments on a telecom customer churn data set show that the method selects features of high importance and removes redundant ones, and that it compares favorably with a Logistic model based on recursive feature elimination and with Embedded-based Adaboost and random forest models.
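The backward-search step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not give the exact retention rule, so the sketch assumes a feature is dropped whenever the validation AUC does not decrease, and that the importance ranking is computed once rather than refreshed after each removal. The function names `auc` and `backward_select`, the feature names, the importance values, and the toy validation data are all hypothetical. In practice the importance dictionary would presumably come from a fitted XGBoost model (for example `Booster.get_score(importance_type="weight")`), and `evaluate` would retrain a model on each candidate subset.

```python
from typing import Callable, Dict, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def backward_select(
    importance: Dict[str, float],
    evaluate: Callable[[List[str]], float],
) -> List[str]:
    """Try removing features from least to most important.

    A removal is kept only if the validation score does not drop
    (assumed rule; the abstract does not state the exact criterion).
    """
    selected = sorted(importance, key=importance.get, reverse=True)
    best = evaluate(selected)
    for feat in sorted(importance, key=importance.get):  # lowest importance first
        if len(selected) == 1:
            break  # always keep at least one feature
        candidate = [f for f in selected if f != feat]
        score = evaluate(candidate)
        if score >= best:  # removal did not hurt: treat the feature as redundant
            selected, best = candidate, score
    return selected


# Hypothetical importance scores (stand-ins for XGBoost split counts).
importance = {"complaints": 30.0, "short_tenure": 20.0, "noise": 5.0}

# Hypothetical validation set: label 1 = churned, 0 = stayed.
labels = [1, 1, 1, 0, 0, 0]
rows = [
    {"complaints": 3, "short_tenure": 2, "noise": 0},
    {"complaints": 1, "short_tenure": 3, "noise": 1},
    {"complaints": 3, "short_tenure": 1, "noise": 0},
    {"complaints": 1, "short_tenure": 1, "noise": 3},
    {"complaints": 0, "short_tenure": 1, "noise": 2},
    {"complaints": 1, "short_tenure": 0, "noise": 3},
]


def evaluate(features: List[str]) -> float:
    """Stand-in for 'retrain on this subset and score the validation set'.

    Here the 'model' is just the sum of the selected feature values.
    """
    scores = [sum(row[f] for f in features) for row in rows]
    return auc(labels, scores)


selected = backward_select(importance, evaluate)
print(selected)
```

On this toy data the misleading `noise` column is removed because dropping it raises the validation AUC, while the two informative columns are retained because dropping either one lowers it.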
Authors: JI Hui-jie (冀慧杰); NI Feng (倪枫); LIU Jiang (刘姜); LU Qi-ling (陆祺灵); ZHANG Xu-yang (张旭阳); QUE Zhong-li (阙中力) (Business School, University of Shanghai for Science & Technology, Shanghai 200093, China)
Source: Computer Technology and Development (《计算机技术与发展》), 2021, No. 5, pp. 21-25 (5 pages)
Funding: National Natural Science Foundation of China (11701370).
Keywords: customer churn prediction; feature selection; XGBoost; feature importance; sequential backward selection