
A filter-based feature selection framework for multi-label data (cited by: 7)

A filtering framework for the multi-label feature selection
Abstract: Research on multi-label learning has focused mainly on classifier performance and has paid little attention to the influence of the dataset's features. This paper proposes a filter-based feature selection framework for multi-label data, then implements and evaluates it using the chi-square test. The framework computes the chi-square score of each feature against each label and derives each feature's final rank from a summary statistic of those per-label scores. Three summary statistics (maximum, average, and minimum) are compared experimentally. Experiments with five evaluation metrics, four commonly used multi-label datasets, and three learners show that each of the three statistics has its own strengths and weaknesses, but all three improve multi-label learning performance.
Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 3, pp. 292-297 (6 pages)
Funding: National Natural Science Foundation of China (61273305)
Keywords: feature selection; multi-label; filter; chi-square test
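
The scoring and ranking step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes binary (0/1) features and labels, so each feature-label pair reduces to a 2x2 contingency table, and the names `chi2_binary`, `rank_features`, `AGGREGATORS` and the toy data are hypothetical.

```python
from statistics import mean


def chi2_binary(feature, label):
    """Chi-square statistic of the 2x2 table built from two 0/1 sequences."""
    n = len(feature)
    pairs = list(zip(feature, label))
    a = sum(1 for f, y in pairs if f == 1 and y == 1)  # feature on, label on
    b = sum(1 for f, y in pairs if f == 1 and y == 0)  # feature on, label off
    c = sum(1 for f, y in pairs if f == 0 and y == 1)  # feature off, label on
    d = n - a - b - c                                  # feature off, label off
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A constant feature or label carries no information (assumed convention).
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# The three summary statistics compared in the abstract.
AGGREGATORS = {"max": max, "avg": mean, "min": min}


def rank_features(X, Y, how="max"):
    """Return (feature indices sorted best-first, aggregated score per feature)."""
    agg = AGGREGATORS[how]
    n_features, n_labels = len(X[0]), len(Y[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        per_label = [
            chi2_binary(column, [row[k] for row in Y]) for k in range(n_labels)
        ]
        scores.append(agg(per_label))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (made up): 4 instances, 3 binary features, 2 binary labels.
X = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]]
Y = [[1, 0], [1, 0], [0, 1], [0, 0]]

for how in ("max", "avg", "min"):
    order, scores = rank_features(X, Y, how)
    print(how, order, [round(s, 3) for s in scores])
```

A filter method would then keep only the top-ranked features before training any multi-label learner. How many to keep is a separate choice that this sketch does not address.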
