Abstract
A rigorous derivation of the online form of Boosting with the exponential loss and the 0-1 loss is presented, proving that both online Boosting algorithms maximize the expected sample margin while minimizing the margin variance. By estimating the margin mean and variance incrementally, Boosting can be applied to online learning problems without loss of classification accuracy. Experiments on UCI machine learning datasets show that exponential-loss online Boosting is as accurate as batch adaptive Boosting (AdaBoost) and significantly outperforms traditional online Boosting, and that 0-1-loss online Boosting minimizes the classification errors of positive and negative samples separately, making it well suited to imbalanced data; it is also more stable on noisy data.
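The incremental margin statistics the abstract refers to can be maintained with a standard one-pass estimator. Below is a minimal sketch, assuming Welford's online algorithm for the running mean and variance and a margin defined as y * F(x) for label y in {-1, +1} and ensemble score F(x); the paper's exact update rule is not given in the abstract, so the OnlineMarginStats class and the toy margin stream are illustrative, not the authors' method.

```python
# Minimal sketch: one-pass (Welford) estimation of the mean and variance
# of sample margins as they stream in. This is an assumed stand-in for
# the paper's incremental margin estimation, not its exact update rule.

class OnlineMarginStats:
    """Running mean and variance of observed margins (Welford's method)."""

    def __init__(self) -> None:
        self.n = 0        # number of margins seen so far
        self.mean = 0.0   # running mean of margins
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, margin: float) -> None:
        """Incorporate one new margin y * F(x) in O(1) time and space."""
        self.n += 1
        delta = margin - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (margin - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of the margins seen so far."""
        return self.m2 / self.n if self.n > 0 else 0.0


# Usage: feed margins as labeled samples arrive (toy values below).
stats = OnlineMarginStats()
for margin in [0.8, -0.2, 0.5, 1.0, 0.3]:
    stats.update(margin)
print(stats.mean, stats.variance)
```

Because each update touches only three scalars, an online Boosting learner can track these statistics per sample stream without storing past data, which is what makes the batch margin objective usable in the online setting.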
Source
Acta Automatica Sinica (《自动化学报》), 2014, No. 4, pp. 635-642 (8 pages)
Indexed in EI, CSCD, and the Peking University Core Journals list (北大核心)
Funding
Supported by the National Natural Science Foundation of China (60974129)