Classification algorithm based on ensemble learning and cost-sensitiveness for class imbalance
Abstract: In data classification, the most important information often lies in a few particular classes. This paper proposes USCensemble, a classification algorithm for class-imbalanced data that combines ensemble learning, undersampling, and cost-sensitive learning. It targets the difficulty that traditional classification algorithms have in correctly identifying minority-class samples. The algorithm adopts the structure of EasyEnsemble. After each round of training, it uses undersampling to select the majority-class samples with the largest weights from that round and combines them with the minority-class samples into a temporary training set, which balances the data for the next round. During training, minority-class samples are also assigned a higher misclassification cost, so the weights of misclassified minority-class samples rise quickly while the weights of majority-class samples fall. This biases the algorithm toward classifying the minority class correctly. Comparative experiments on classification tasks generated from 10 UCI data sets show that USCensemble is competitive and identifies minority-class samples better in these comparisons.
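The training loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract gives no formulas, so the AdaBoost-style exponential weight update with a per-class cost factor, the decision-stump weak learner, the default `minority_cost=2.0`, and all function names (`fit_usc_sketch`, `predict_usc_sketch`, `fit_stump`, `stump_predict`) are hypothetical choices made here. Labels are assumed to be +1 for the minority class and -1 for the majority class.

```python
import numpy as np

def stump_predict(stump, X):
    """Predict +1/-1 with a one-feature threshold rule."""
    feature, threshold, polarity = stump
    return polarity * np.where(X[:, feature] > threshold, 1, -1)

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, np.inf
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = w[stump_predict(stump, X) != y].sum()
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def fit_usc_sketch(X, y, n_rounds=10, minority_cost=2.0):
    """Assumed sketch: y is +1 (minority) or -1 (majority)."""
    w = np.full(len(y), 1.0 / len(y))            # sample weights
    cost = np.where(y == 1, minority_cost, 1.0)  # higher cost for minority
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == -1)
    ensemble = []
    for _ in range(n_rounds):
        # Undersampling: keep the highest-weight majority samples,
        # as many as there are minority samples.
        order = np.argsort(-w[majority], kind="stable")
        chosen = majority[order[:len(minority)]]
        idx = np.concatenate([minority, chosen])  # temporary training set
        w_tmp = w[idx] / w[idx].sum()
        stump, err = fit_stump(X[idx], y[idx], w_tmp)
        err = np.clip(err, 1e-6, 1 - 1e-6)        # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        # Assumed cost-sensitive update: a misclassified minority sample
        # gains weight faster than a misclassified majority sample.
        w = w * np.exp(-alpha * cost * y * stump_predict(stump, X))
        w = w / w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def predict_usc_sketch(ensemble, X):
    """Weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, X) for alpha, stump in ensemble)
    return np.where(score > 0, 1, -1)
```

Calling `fit_usc_sketch(X, y)` on a NumPy feature matrix and a +1/-1 label vector returns a list of `(alpha, stump)` pairs that `predict_usc_sketch` combines by weighted vote. Selecting the top-weighted majority samples deterministically is one reading of "selecting majority-class samples with large weights"; sampling with probability proportional to weight would be another plausible reading.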
Author: HE Zhichen (贺指陈), School of Applied Mathematics, Guangdong University of Technology, Guangzhou, Guangdong 510520, China
Source: Information Recording Materials (《信息记录材料》), 2022, No. 1, pp. 18-22 (5 pages)
Keywords: class imbalance data; classification; ensemble learning; undersampling; cost-sensitiveness