
A Multi-Label Classification Algorithm Using Correlation Information Entropy (cited by: 3)
Abstract: In multi-label classification, the correlation among labels is an important factor. To exploit it, this paper proposes a multi-label classification algorithm based on correlation information entropy, which is used to measure how strongly labels are correlated. The algorithm first finds the set of k-label combinations with the largest correlation information entropy and then trains an LP (Label Powerset) classifier on each combination. Experiments on seven datasets show that the proposed algorithm outperforms the compared classification algorithms on most of the datasets, while the compared algorithms outperform it on only one dataset.

In our opinion, the LP (label powerset) classifier may put uncorrelated labels into the same label set and train that set as a single label. To solve this problem, multi-label classification needs to make use of the correlations among labels. We therefore propose a multi-label classification algorithm using correlation information entropy (MLCACIE), in which correlation information entropy measures the strength of label correlation. Its core consists of two steps: (1) given the number of classifiers (CN) to be trained, we find the CN subsets of k labels with the strongest correlation; (2) we train one LP classifier on each of these CN k-label subsets. Finally, we run experiments with MLCACIE on seven datasets, using a decision tree as the base classifier, and compare it with other classification algorithms. The experimental results, given in Table 3, and their analysis show preliminarily that: (1) MLCACIE outperforms the other classification algorithms on most datasets because it makes use of the correlations among labels, while the other algorithms outperform MLCACIE on only one of the seven datasets; (2) using the correlations among labels can enhance multi-label classification performance.
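The two steps in the abstract (rank k-label subsets by correlation, then build one Label Powerset problem per selected subset) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the correlation information entropy formula, so the score below substitutes total correlation (sum of marginal entropies minus joint entropy) as a stand-in, and the function names `total_correlation`, `top_labelsets`, and `lp_targets` are hypothetical, not the authors' code.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def total_correlation(Y, subset):
    """Stand-in correlation score (NOT the paper's measure):
    sum of marginal label entropies minus the joint entropy of the subset."""
    marginals = sum(entropy([row[j] for row in Y]) for j in subset)
    joint = entropy([tuple(row[j] for j in subset) for row in Y])
    return marginals - joint


def top_labelsets(Y, k, cn):
    """Return the cn k-label subsets with the highest stand-in score."""
    n_labels = len(Y[0])
    ranked = sorted(
        combinations(range(n_labels), k),
        key=lambda s: total_correlation(Y, s),
        reverse=True,
    )
    return ranked[:cn]


def lp_targets(Y, subset):
    """Label Powerset transform: every distinct combination of the
    subset's label values becomes one class of a single-label problem."""
    return [tuple(row[j] for j in subset) for row in Y]


# Toy label matrix: labels 0 and 1 always co-occur; label 2 is independent.
Y = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
]

chosen = top_labelsets(Y, k=2, cn=1)
targets = lp_targets(Y, chosen[0])
```

In a full pipeline, each `targets` list would be passed, together with the feature vectors, to a multi-class base learner such as a decision tree, giving one LP classifier per selected subset.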
Source: Journal of Northwestern Polytechnical University (indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2012, No. 6, pp. 968-973 (6 pages)
Funding: Supported by the National Science and Technology Major Project (2012ZX03005007)
Keywords: multi-label classification; data processing; correlation information entropy; correlation; algorithms; classification (of information); correlation theory; decision trees; entropy; information theory; labels


