
Feature Selection Algorithm with Imbalanced Multi-labels Based on Rough Mutual Information

Cited by: 3
Abstract: Feature selection, as an effective method for handling the high dimensionality of data in multi-label learning, has attracted wide attention from researchers. Since some features correlate strongly only with certain labels rather than with the entire label space, features cannot be retained or discarded simply on the basis of their correlation with the label space as a whole. In addition, the distribution of multiple labels is imbalanced. The label space is therefore partitioned according to label density, correlation is judged separately within each partition, and different sampling proportions are applied in the different label subspaces. Rough entropy, which has a complementary property, is introduced in place of the traditional entropy measure, and a feature selection algorithm for imbalanced multi-label data based on rough mutual information is proposed. Experimental results on five public data sets demonstrate the effectiveness of the algorithm.
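The abstract's core scoring step can be illustrated with a minimal sketch. It assumes the complement ("rough") entropy of a partition, E(A) = Σ pᵢ(1 − pᵢ), as the entropy measure with a complementary property, and defines rough mutual information analogously to Shannon MI. All function names are illustrative, and the paper's label-density partitioning and differential sampling steps are omitted here.

```python
from collections import Counter


def complement_entropy(values):
    """Complement ('rough') entropy of the partition induced by values:
    E = sum over blocks of p_i * (1 - p_i)."""
    n = len(values)
    counts = Counter(values)
    return sum((c / n) * (1 - c / n) for c in counts.values())


def rough_mutual_information(feature, label):
    """RMI(F; L) = E(F) + E(L) - E(F, L), with the joint partition
    obtained by pairing feature and label values element-wise."""
    joint = list(zip(feature, label))
    return (complement_entropy(feature)
            + complement_entropy(label)
            - complement_entropy(joint))


def select_features(feature_columns, label, k):
    """Rank feature columns by RMI against the label and keep the top k.
    feature_columns: dict mapping feature name -> list of values."""
    ranked = sorted(feature_columns.items(),
                    key=lambda kv: rough_mutual_information(kv[1], label),
                    reverse=True)
    return [name for name, _ in ranked[:k]]
```

For example, a feature column identical to the label scores strictly higher than a constant (uninformative) column, so `select_features` prefers it; a full implementation would compute such scores separately within each density-based label subspace.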
Authors: CHEN Fei; SHI Jincheng (School of Mathematics and Computer Science, Tongling University, Tongling 244061, China)
Source: Journal of Anqing Normal University (Natural Science Edition), 2021, No. 1, pp. 40-43, 58 (5 pages)
Funding: Tongling University institutional project (2019tlxy35).
Keywords: machine learning; feature selection; rough mutual information; imbalance; multi-label learning
