Abstract
To address the high time complexity and the inability to distinguish misclassification costs among classes that arise when a multi-class algorithm is built by extending several binary cost-sensitive algorithms, a multi-class cost-sensitive AdaBoost algorithm using a multi-class cost-sensitive exponential loss function (MCCSADA) is proposed. To ensure the cost-sensitive property, a multi-class cost-sensitive exponential loss function is first designed that satisfies the design guidelines for cost-sensitive loss functions. This loss function is then used as the criterion for evaluating classifier performance, and the optimal weighting coefficients of the base classifiers are derived with a forward stagewise additive model by minimizing the loss. Finally, the loss function and the coefficient formula of the multi-class AdaBoost algorithm are replaced with the new loss function and the derived optimal coefficient formula, which yields the cost-sensitive MCCSADA algorithm. The algorithm is verified on UCI datasets. The results show that its stability is improved and the degeneration phenomenon is weakened. Compared with the multi-class cost-sensitive algorithm obtained by extending binary cost-sensitive algorithms through the one-versus-one approach (CSOVO), MCCSADA achieves a lower cost in most cases and has lower time complexity: the time complexity is reduced by about 40% on three-class datasets, and the efficiency gain becomes more pronounced as the number of classes increases.
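The abstract does not reproduce the MCCSADA loss function or its coefficient formula, so the following is only a rough illustration of the general idea of a single multi-class boosted model that takes costs into account. It is a minimal sketch of a SAMME-style multi-class AdaBoost with cost-proportional initial sample weights and decision stumps as base classifiers. The function names, the stump learner, and the choice of using the row sum of the cost matrix as the per-sample cost are all assumptions made for illustration, not the paper's method.

```python
import math

def fit_stump(X, y, w, K):
    """Find the one-feature threshold stump with the lowest weighted error.

    Returns ((feature, threshold, left_class, right_class), weighted_error).
    """
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left, right = [0.0] * K, [0.0] * K
            for row, label, wi in zip(X, y, w):
                (left if row[f] <= t else right)[label] += wi
            # Each side predicts its weighted-majority class.
            lc = max(range(K), key=left.__getitem__)
            rc = max(range(K), key=right.__getitem__)
            err = (sum(left) - left[lc]) + (sum(right) - right[rc])
            if err < best_err:
                best_err, best = err, (f, t, lc, rc)
    return best, best_err

def stump_predict(stump, row):
    f, t, lc, rc = stump
    return lc if row[f] <= t else rc

def fit_cost_weighted_samme(X, y, cost, K, rounds=10):
    """SAMME with cost-proportional initial weights (illustrative assumption).

    cost[i][j] is the cost of predicting class j when the true class is i.
    """
    # Assumption: a sample's importance is the total cost of misclassifying
    # its true class (row sum of the cost matrix).
    w = [float(sum(cost[label])) for label in y]
    total = sum(w)
    w = [wi / total for wi in w]
    ensemble = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w, K)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        # Standard SAMME coefficient for K classes.
        alpha = math.log((1 - err) / err) + math.log(K - 1)
        if alpha <= 0:  # base classifier no better than random guessing
            break
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(alpha) if stump_predict(stump, row) != label else wi
             for row, label, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row, K):
    """Return the class with the largest alpha-weighted vote."""
    votes = [0.0] * K
    for alpha, stump in ensemble:
        votes[stump_predict(stump, row)] += alpha
    return max(range(K), key=votes.__getitem__)

if __name__ == "__main__":
    # Toy three-class data; class 2 is five times as costly to misclassify.
    X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
    y = [0, 0, 1, 1, 2, 2]
    cost = [[0, 1, 1], [1, 0, 1], [5, 5, 0]]
    model = fit_cost_weighted_samme(X, y, cost, K=3, rounds=3)
    print([predict(model, row, 3) for row in X])
```

The sketch trains one ensemble for all K classes, whereas a one-versus-one extension trains a separate binary ensemble for each of the K(K-1)/2 class pairs. This difference in the number of trained models is the kind of saving the abstract reports.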
Source
《西安交通大学学报》
EI
CAS
CSCD
PKU Core Journal (北大核心)
2017, No. 8, pp. 33-39 (7 pages)
Journal of Xi'an Jiaotong University
Funding
Supported by the National Natural Science Foundation of China (61273275, 61503407)