Abstract
P-AdaBoost restructures the traditional AdaBoost algorithm so that its core steps can be executed in parallel, which greatly improves execution efficiency. However, P-AdaBoost does not account for the negative impact of noisy samples on the training result. Based on an analysis of P-AdaBoost, this paper modifies the initial weight distribution of the original algorithm and proposes a noise detection algorithm to improve the performance of P-AdaBoost on noisy data sets. Experimental results show that, compared with the original P-AdaBoost, the improved algorithm gains nearly 5 percentage points on data sets with noise, and also shows some improvement on noise-free data sets. This indicates that the proposed algorithm is more robust and achieves higher classification accuracy on most data sets.
Source
《计算机应用与软件》
Peking University Core Journal (北大核心)
2018, Issue 1, pp. 288-294 (7 pages)
Computer Applications and Software