
Cost-sensitive Random Forest Classifier with New Impurity Measurement
Cited by: 8
Abstract: To classify imbalanced data sets effectively, a classifier that combines cost-sensitive learning with the random forest algorithm is proposed. First, a new impurity measure is introduced that accounts not only for the total cost of the decision tree but also for the cost difference of the same node across different samples. Second, the random forest algorithm is executed: the data set is sampled K times to build K base classifiers. Third, each decision tree is constructed with the classification and regression tree (CART) algorithm using the proposed impurity measure, and together the trees form the decision-tree forest. Finally, the random forest makes its classification decision by a voting mechanism. In experiments on the UCI database, compared with the traditional random forest and existing cost-sensitive random forest classifiers, the proposed classifier performs well on three performance measures: classification accuracy, AUC, and the Kappa coefficient.
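The four steps in the abstract (impurity measure, K bootstrap samples, CART-style trees, voting) can be illustrated with a minimal sketch. The abstract does not give the formula of the proposed impurity measure, so the `cost_weighted_gini` function below is only an assumed stand-in (a Gini index whose class counts are scaled by a per-class misclassification cost), not the authors' measure. The trees are reduced to depth-1 stumps for brevity, and all function names, the `cost` dictionary, and the toy data are hypothetical.

```python
import random
from collections import Counter


def cost_weighted_gini(labels, cost):
    """Gini impurity with class counts scaled by misclassification cost.

    ASSUMPTION: a stand-in for the paper's impurity measure, whose exact
    form is not stated in the abstract.
    """
    if not labels:
        return 0.0
    weighted = {c: n * cost[c] for c, n in Counter(labels).items()}
    total = sum(weighted.values())
    return 1.0 - sum((w / total) ** 2 for w in weighted.values())


def cost_weighted_label(labels, cost):
    """Leaf label: the class with the largest cost-scaled count."""
    counts = Counter(labels)
    return max(counts, key=lambda c: counts[c] * cost[c])


def fit_stump(X, y, cost, rng):
    """Depth-1 CART-style split chosen on a random subset of features."""
    n_features = len(X[0])
    features = rng.sample(range(n_features), max(1, int(n_features ** 0.5)))
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            # Size-weighted impurity of the two children.
            score = (len(left) * cost_weighted_gini(left, cost)
                     + len(right) * cost_weighted_gini(right, cost)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        cost_weighted_label(left, cost),
                        cost_weighted_label(right, cost))
    if best is None:  # no valid split: predict a single label everywhere
        lab = cost_weighted_label(y, cost)
        return (features[0], float("inf"), lab, lab)
    return best[1:]


def fit_forest(X, y, cost, K=25, seed=0):
    """Draw K bootstrap samples and fit one base classifier on each."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(K):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx],
                                cost, rng))
    return forest


def predict(forest, row):
    """Majority vote over the base classifiers."""
    votes = Counter(left if row[f] <= t else right
                    for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy imbalanced data: class 1 is the rare, more costly class.
    X = [[0.0], [1.0], [2.0], [3.0], [10.0], [11.0]]
    y = [0, 0, 0, 0, 1, 1]
    cost = {0: 1.0, 1: 3.0}  # hypothetical misclassification costs
    forest = fit_forest(X, y, cost)
    print([predict(forest, row) for row in X])
```

Scaling the class counts by cost makes a node that contains a few high-cost minority samples look less pure than it would under the plain Gini index, which is one common way to push splits toward isolating the minority class. A full implementation would grow deeper trees and use the measure defined in the paper itself.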
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2017, Issue B11, pp. 98-101 (4 pages)
Keywords: cost-sensitive learning; random forest; impurity measurement; classification and regression tree (CART); imbalanced data
