
Research on Dimension Reduction and Classification Algorithms Based on Multi-Label Data
Abstract: Single-label classification is the familiar case, and traditional supervised learning methods are applied mainly to single-label data. As data grow richer, however, a single label can no longer fully describe a sample. A sample now often corresponds to several labels, so multi-label classification has gradually become an important research direction in data mining. Although multiple labels describe a sample better, multi-label data usually have a very large number of features, which makes them hard to process directly, and such high-dimensional data often suffer from the curse of dimensionality. Reducing the dimensionality of multi-label data before classification therefore has a significant effect on the final result. This paper proposes selecting the feature set with conditional mutual information (the minimum redundancy and maximum dependency criterion, MDMR) to remove uninformative features, and then classifying the data with an improved KNN algorithm. Experiments show that this method raises the average recall by 2.5%.
Affiliation: Shanghai Maritime University
Source: Modern Computer (《现代计算机(中旬刊)》), 2016, Issue 5, pp. 3-9 (7 pages)
Keywords: Single-Label; Multi-Label; Conditional Mutual Information; Feature Extraction; KNN Algorithm
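
The abstract names the feature-selection idea but not its exact formula, so the following is only a minimal sketch of greedy feature selection driven by conditional mutual information on discrete multi-label data. The scoring rule (label information that remains after conditioning on each already-selected feature, averaged over those features) is an assumption made for illustration and is not taken from the paper. The function names `mutual_information`, `conditional_mutual_information`, and `select_features` are likewise hypothetical. The improved KNN classifier is not described in the abstract and is not sketched here.

```python
from collections import Counter, defaultdict
from math import log2


def mutual_information(xs, ys):
    """I(X;Y) in bits, estimated from counts of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def conditional_mutual_information(xs, ys, zs):
    """I(X;Y|Z): the MI of X and Y inside each Z group, weighted by p(z)."""
    groups = defaultdict(lambda: ([], []))
    for x, y, z in zip(xs, ys, zs):
        groups[z][0].append(x)
        groups[z][1].append(y)
    n = len(xs)
    return sum(len(gx) / n * mutual_information(gx, gy) for gx, gy in groups.values())


def select_features(X, Y, k):
    """Greedily pick k feature indices (assumed scoring rule, not the paper's)."""
    features = list(zip(*X))  # one tuple of values per feature column
    labels = list(zip(*Y))    # one tuple of values per label column
    k = min(k, len(features))
    selected = []
    while len(selected) < k:
        def score(j):
            if not selected:  # first pick: plain dependency on the labels
                return sum(mutual_information(features[j], lab) for lab in labels)
            # later picks: label information left after conditioning on each
            # already-selected feature, averaged over the selected features
            return sum(
                conditional_mutual_information(features[j], lab, features[s])
                for s in selected for lab in labels
            ) / len(selected)
        remaining = [j for j in range(len(features)) if j not in selected]
        selected.append(max(remaining, key=score))
    return selected


# Toy data: feature 1 duplicates feature 0, feature 3 is unrelated to both labels.
X = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 1, 1],
     [1, 1, 0, 0], [1, 1, 0, 1], [1, 1, 1, 0], [1, 1, 1, 1]]
Y = [[0, 0], [0, 0], [0, 1], [0, 1], [1, 0], [1, 0], [1, 1], [1, 1]]
print(select_features(X, Y, 2))
```

In this toy run the duplicated feature adds no label information once feature 0 is chosen, so the second pick goes to the feature that explains the other label. Continuous features would need discretization first, which this sketch leaves out.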

