Abstract
The SAMME algorithm classifies imbalanced data sets poorly and adapts weakly to different data sets. To address these shortcomings, it is combined with the extreme learning machine (ELM) and improved in targeted ways: the initial weights of the training samples are reassigned according to the sample distribution, and the update strategies for the sample weights and the weak-classifier weights during training are modified so that each weak classifier receives a reward term proportional to its ability to recognize minority-class samples. This increases the sensitivity of the resulting classifier to hard-to-classify samples and significantly improves the performance of the final ensemble classifier. In comparison experiments between the ensemble algorithm and its component sub-algorithms, the proposed method achieves better G-mean and F1 values, which verifies its effectiveness. In addition, comparison experiments with other classification algorithms show that the proposed algorithm also achieves higher G-mean and F1 values on most data sets, giving better classification results.
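The abstract does not give the paper's formulas, so the following is only a minimal Python sketch of the ideas it names, not the authors' implementation. It assumes one common way to initialize weights from the class distribution (equal total weight per class), uses the standard SAMME weak-classifier weight, and adds a purely hypothetical linear reward term (`reward * minority_recall`) to illustrate a bonus proportional to minority-class recognition. The function names and the `reward` parameter are illustrative assumptions. G-mean is computed as the geometric mean of per-class recalls, and F1 as the harmonic mean of precision and recall.

```python
import math
from collections import Counter


def balanced_initial_weights(labels):
    """Assumed scheme: every class gets the same total weight,
    split evenly among its samples, so minority samples weigh more."""
    counts = Counter(labels)
    k = len(counts)
    return [1.0 / (k * counts[y]) for y in labels]


def samme_alpha(weighted_error, n_classes, minority_recall=0.0, reward=0.0):
    """Standard SAMME weight ln((1 - err) / err) + ln(K - 1),
    plus a hypothetical reward proportional to minority-class recall."""
    eps = 1e-12
    err = min(max(weighted_error, eps), 1.0 - eps)  # avoid log(0)
    base = math.log((1.0 - err) / err) + math.log(n_classes - 1)
    return base + reward * minority_recall


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))


def f1_score(y_true, y_pred, positive):
    """F1 for the class treated as positive (here, the minority class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With labels `[0, 0, 0, 0, 1]`, for example, `balanced_initial_weights` assigns 0.125 to each majority sample and 0.5 to the single minority sample, so both classes start with the same total weight.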
Authors
李克文
丁胜夺
段鸿杰
LI Kewen; DING Shengduo; DUAN Hongjie (School of Computer and Communication Engineering, China University of Petroleum (East China), Qingdao 266580; Information Center of Sinopec Shengli Oilfield Branch, Dongying 257000)
Source
《计算机与数字工程》
2021, No. 6, pp. 1058-1062, 1076 (6 pages in total)
Computer & Digital Engineering