
Multi-Label Classification Algorithm Based on Feature Selection and Label Correlation (cited by: 4)
Abstract: Multi-label classification has wide real-world application and is an active research topic in machine learning. Binary Relevance (BR), a representative multi-label classification algorithm, has been improved in many ways, but most of that work addresses only one of two aspects, label correlation or feature selection, so the existing improved algorithms still leave room for better performance. To address this, the paper proposes a multi-label classification algorithm based on both feature selection and label correlation. The algorithm first uses information gain to select, for each label, the feature attributes relevant to it; it then accounts for label correlation through a new stacking-based control structure; finally, it trains a binary classifier for each label on the resulting feature set. Experiments on six benchmark datasets show that the algorithm outperforms other typical improved BR algorithms under five different evaluation metrics.
Authors: CAI Jian; MU Jiapeng; YU Mengchi; XU Jian (School of Computer Science & Engineering, Nanjing University of Science and Technology, Nanjing 210094)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2021, No. 10, pp. 1967-1972, 1997 (7 pages)
Keywords: classification; multi-label learning; feature selection; label correlation
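
The three steps named in the abstract (per-label feature selection by information gain, a stacking step for label correlation, and one binary classifier per label) can be sketched as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the abstract does not say how the stacking structure is wired, so feeding each label's final classifier the first-level predictions of the other labels is an assumption, as are the binary 0/1 features, the small Bernoulli naive Bayes base learner, the parameter `k`, all function names, and the toy data.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

def info_gain(feature, label):
    """Information gain of one discrete feature column for one label column."""
    n = len(label)
    cond = 0.0
    for v in set(feature):
        subset = [y for x, y in zip(feature, label) if x == v]
        cond += len(subset) / n * entropy(subset)
    return entropy(label) - cond

def select_features(X, y, k):
    """Indices of the k features with the highest information gain for label y."""
    gains = [info_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(gains)), key=lambda j: -gains[j])[:k]

class BernoulliNB:
    """Tiny Bernoulli naive Bayes for 0/1 features (assumed stand-in base learner)."""
    def fit(self, X, y):
        self.prior, self.p = {}, {}
        for c in (0, 1):
            rows = [x for x, t in zip(X, y) if t == c]
            # Laplace smoothing keeps every probability strictly inside (0, 1).
            self.prior[c] = (len(rows) + 1) / (len(X) + 2)
            self.p[c] = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                         for j in range(len(X[0]))]
        return self

    def predict_one(self, x):
        score = {}
        for c in (0, 1):
            s = math.log(self.prior[c])
            for xj, pj in zip(x, self.p[c]):
                s += math.log(pj if xj else 1 - pj)
            score[c] = s
        return int(score[1] > score[0])

def fit_stacked_br(X, Y, k):
    """Fit per-label feature subsets, first-level and second-level classifiers."""
    n_labels = len(Y[0])
    cols = [[row[l] for row in Y] for l in range(n_labels)]
    # Step 1: label-specific feature selection by information gain.
    feats = [select_features(X, cols[l], k) for l in range(n_labels)]
    level1 = [BernoulliNB().fit([[x[j] for j in feats[l]] for x in X], cols[l])
              for l in range(n_labels)]
    # Step 2 (assumed stacking): first-level predictions for every label.
    Z = [[level1[l].predict_one([x[j] for j in feats[l]]) for l in range(n_labels)]
         for x in X]
    # Step 3: final classifier per label on selected features + other labels' predictions.
    level2 = []
    for l in range(n_labels):
        aug = [[x[j] for j in feats[l]] + [z[m] for m in range(n_labels) if m != l]
               for x, z in zip(X, Z)]
        level2.append(BernoulliNB().fit(aug, cols[l]))
    return feats, level1, level2

def predict_stacked_br(model, x):
    """Predict the 0/1 label vector for one instance x."""
    feats, level1, level2 = model
    n = len(feats)
    z = [level1[l].predict_one([x[j] for j in feats[l]]) for l in range(n)]
    return [level2[l].predict_one([x[j] for j in feats[l]] +
                                  [z[m] for m in range(n) if m != l])
            for l in range(n)]

# Toy data (made up): label 0 copies feature 0, label 1 copies feature 2.
X = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1],
     [0, 1, 0, 1], [1, 0, 0, 0], [0, 1, 1, 1]]
Y = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 0], [0, 1]]
model = fit_stacked_br(X, Y, k=2)
print(predict_stacked_br(model, [1, 0, 1, 0]))
```

Training the second level on first-level predictions, rather than on the true labels of the other classes, keeps the inputs seen at training time consistent with those available at prediction time; a more careful version would produce those predictions with cross-validation to limit overfitting.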
