
An Improvement and Application of the C5.0 Algorithm (cited by: 11)
Abstract: The C5.0 algorithm is an intuitive and efficient classification method, but it suffers from a complex information gain ratio calculation, a tendency to overfit, and decision tree bias. To address these problems, an improved C5.0 algorithm is proposed. The calculation of the information gain ratio is simplified by transforming its formula, pruning decisions are made by combining a loss matrix with a confidence interval, and the weights of the multiple models that are built are adjusted. The improved algorithm is applied to the prediction of overdue credit. Experiments on borrowers' historical repayment data, with comparisons against other algorithms, show that the improved C5.0 algorithm achieves higher accuracy and efficiency than the other algorithms.
Authors: LUO Lijuan, DUAN Longzhen, DUAN Wenying, LIU Ping (School of Information Engineering, Nanchang University, Nanchang 330031, China)
Source: Journal of Nanchang University (Engineering & Technology), CAS, 2017, No. 1, pp. 92-97 (6 pages)
Funding: National Natural Science Foundation of China (61070139, 81460769)
Keywords: C5.0 algorithm; information gain ratio; confidence interval; weight adjustment; overdue credit
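
For readers unfamiliar with the information gain ratio named in the abstract, the following is a minimal sketch of the standard textbook definition (information gain divided by split information) as used in C4.5-style trees. It is not the simplified formula proposed in the paper, which this record does not reproduce. The function names `entropy` and `gain_ratio` and the toy repayment data are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gain_ratio(values, labels):
    """Standard information gain ratio of a categorical attribute.

    Assumption: this is the textbook definition, not the paper's
    simplified variant.
    """
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus the
    # size-weighted entropy of the groups after the split.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    # An attribute with a single value carries no split information.
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: past payment behaviour vs. overdue outcome.
history = ["late", "late", "on_time", "on_time"]
outcome = ["overdue", "overdue", "ok", "ok"]
ratio = gain_ratio(history, outcome)
```

In this toy case the attribute separates the two classes perfectly, so both the gain and the split information equal one bit. Each call recomputes several logarithms per attribute value, which gives some intuition for why the abstract describes the calculation as complex.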
