Cost-sensitive Ensemble Learning Algorithm for Multi-label Classification Problems (多标签代价敏感分类集成学习算法)
Cited by: 23
Abstract: Although a multi-label classification problem can be converted into an ordinary multi-class classification problem, a multi-label cost-sensitive classification problem is difficult to convert into a multi-class cost-sensitive classification problem. By analyzing the problems that arise when multi-class cost-sensitive learning algorithms are extended to the multi-label setting, a cost-sensitive ensemble learning algorithm for multi-label classification is proposed. The average misclassification cost of the algorithm is the sum of the cost of falsely detected labels and the cost of missed labels. The procedure is similar to that of the adaptive boosting (AdaBoost) algorithm: it automatically learns multiple weak classifiers and combines them into a strong classifier, and the average misclassification cost of the strong classifier gradually decreases as weak classifiers are added. The differences between the proposed multi-label cost-sensitive ensemble learning algorithm and the multi-class cost-sensitive AdaBoost algorithm are analyzed in detail, including the basis on which labels are output and the meaning of the misclassification cost. Unlike ordinary multi-class cost-sensitive classification problems, the misclassification costs in multi-label cost-sensitive classification problems are subject to certain restrictions; these are analyzed in detail and the specific conditions are given. Simplifying the proposed algorithm yields a multi-label AdaBoost algorithm and a multi-class cost-sensitive AdaBoost algorithm. Both theoretical analysis and experimental results show that the proposed algorithm is effective and can minimize the average misclassification cost. In particular, for multi-class problems in which the misclassification costs of different classes differ greatly, the proposed algorithm performs markedly better than existing multi-class cost-sensitive AdaBoost algorithms.
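The cost described in the abstract (false-detection cost plus missed-label cost, averaged over samples) can be illustrated with a minimal sketch. The abstract does not give the paper's exact formula, so the per-label cost vectors `fp_cost` and `fn_cost`, the function name, and the 0/1 indicator encoding are assumptions made for illustration only; this is not the author's algorithm.

```python
def average_misclassification_cost(y_true, y_pred, fp_cost, fn_cost):
    """Mean per-sample cost of falsely detected plus missed labels.

    y_true, y_pred: lists of 0/1 label-indicator rows, one row per sample.
    fp_cost, fn_cost: hypothetical per-label costs (assumed, not from the paper)
    for predicting an absent label and for missing a present label.
    """
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        for t, p, c_fp, c_fn in zip(truth, pred, fp_cost, fn_cost):
            if p and not t:
                total += c_fp   # label predicted but not actually present
            elif t and not p:
                total += c_fn   # label present but not predicted
    return total / len(y_true)


# Two samples, three labels: sample 1 has one false detection and one miss.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 1, 0], [0, 1, 0]]
cost = average_misclassification_cost(y_true, y_pred,
                                      fp_cost=[1, 2, 1], fn_cost=[5, 5, 4])
```

In this toy case the first sample incurs a false-detection cost of 2 and a missed-label cost of 4, and the second sample incurs none, so the average over the two samples is 3.0.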
Author: 付忠良 (Fu Zhongliang)
Source: Acta Automatica Sinica (《自动化学报》), 2014, No. 6, pp. 1075-1085 (11 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: Supported by the Sichuan Province Science and Technology Support Program (2011GZ0171, 2012GZ0106).
Keywords: multi-label classification, cost-sensitive learning, ensemble learning, adaptive boosting (AdaBoost) algorithm, multi-class classification