
Classification design method of unbalanced data sets considering unbalanced index
Abstract: Class imbalance is one of the important problems in classification. The unbalanced index of a data set is closely tied to the data set itself and is a key indicator of it. To address classification design for unbalanced data sets, this paper proposes an improved AdaBoost algorithm, enhanced AdaBoost (E-AdaBoost). During iteration, the algorithm takes into account both the unbalanced index and the classification accuracy of the minority class, which is the more important class in unbalanced data sets. It modifies the weight update strategy of the base classifiers and thereby improves classification performance on unbalanced data sets. The E-AdaBoost-based classification design method determines the weight parameters of the base classifiers according to the unbalanced index of the samples. The method is combined with several classical classifiers, evaluated experimentally on artificial and standard data sets, and compared with related methods. The results show that the E-AdaBoost-based classification design method effectively improves classification performance on unbalanced data sets.
Authors: Zhou Yu (周玉), Yue Xuezhen (岳学震), Sun Hongyu (孙红玉) (School of Electrical Engineering, North China University of Water Resources & Electric Power, Zhengzhou 450011, China)
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2023, No. 12, pp. 3566-3571, 3577 (7 pages)
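Funding: National Natural Science Foundation of China (U1504622, 31671580); Training Program for Young Backbone Teachers in Higher Education Institutions of Henan Province (2018GGJS079).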
Keywords: unbalanced classification; enhanced AdaBoost; unbalanced index; weight
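
The abstract describes the idea behind E-AdaBoost (letting the unbalanced index and the minority-class accuracy influence the base-classifier weights) but does not give the update formula. The following is a minimal sketch of that idea only, not the authors' method. It assumes the unbalanced index is the ratio of majority to minority sample counts, that labels are -1/+1 with +1 as the minority class, and it uses a hypothetical adjustment `alpha * (1 + recall * log(IR))` chosen purely for illustration. All function names are hypothetical.

```python
import numpy as np

def imbalance_index(y):
    """Majority count / minority count (an ASSUMED definition of the index)."""
    _, counts = np.unique(y, return_counts=True)
    return counts.max() / counts.min()

def fit_stump(X, y, w):
    """Exhaustively search for the decision stump with the lowest weighted error."""
    best = (0, 0.0, 1, np.inf)  # (feature, threshold, polarity, weighted error)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = np.where(s * (X[:, j] - t) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, t, s, err)
    return best

def stump_predict(X, stump):
    j, t, s, _ = stump
    return np.where(s * (X[:, j] - t) >= 0, 1, -1)

def boost(X, y, rounds=10, minority=1):
    """AdaBoost loop with a HYPOTHETICAL imbalance-aware classifier weight."""
    w = np.full(len(y), 1.0 / len(y))
    ir = imbalance_index(y)
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        pred = stump_predict(X, stump)
        err = np.clip(stump[3], 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # standard AdaBoost weight
        # Accuracy on the minority class (assumes at least one minority sample).
        recall = np.mean(pred[y == minority] == minority)
        # Illustrative adjustment, NOT the paper's formula: classifiers that
        # recover the minority class get more say as the imbalance grows.
        alpha_e = alpha * (1.0 + recall * np.log(ir))
        ensemble.append((alpha_e, stump))
        w = w * np.exp(-alpha * y * pred)  # standard sample-weight update
        w = w / w.sum()
    return ensemble

def predict(X, ensemble):
    score = sum(a * stump_predict(X, s) for a, s in ensemble)
    return np.where(score >= 0, 1, -1)
```

With a balanced data set the index is 1, `log(IR)` is 0, and the sketch reduces to ordinary AdaBoost. The paper should be consulted for the actual weight update.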
