Abstract
Research on multi-label learning has focused mainly on classifier performance and has largely neglected the influence of the dataset's features. This paper proposes a filter framework for feature selection on multi-label data, and implements and evaluates it using the chi-square test. The framework computes the chi-square statistic of each feature with respect to each label, and then derives the final ranking of each feature from a summary statistic of these scores. Three summary statistics (maximum, average, and minimum) are compared experimentally. Comparative experiments with five evaluation criteria, four commonly used multi-label datasets, and three learners show that each of the three score statistics has its own strengths and weaknesses, but all of them improve multi-label learning performance.
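The ranking procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes binary features and binary labels and uses the standard 2x2 contingency-table form of the chi-square statistic, neither of which the abstract specifies. The names `chi2_binary` and `rank_features`, and the toy data, are hypothetical.

```python
from statistics import mean


def chi2_binary(feature, label):
    """Chi-square statistic of the 2x2 contingency table between
    one binary feature column and one binary label column (assumed form)."""
    n = len(feature)
    a = sum(1 for f, y in zip(feature, label) if f and y)          # feature=1, label=1
    b = sum(1 for f, y in zip(feature, label) if f and not y)      # feature=1, label=0
    c = sum(1 for f, y in zip(feature, label) if not f and y)      # feature=0, label=1
    d = n - a - b - c                                              # feature=0, label=0
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A constant feature or label carries no information.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def rank_features(X, Y, aggregate=max):
    """Score every feature against every label, collapse the per-label
    scores with `aggregate` (max, mean, or min), and rank features by
    descending aggregated score.

    X: list of samples, each a list of binary feature values.
    Y: list of samples, each a list of binary label values.
    Returns (feature indices in rank order, aggregated scores)."""
    n_features = len(X[0])
    n_labels = len(Y[0])
    label_cols = [[row[k] for row in Y] for k in range(n_labels)]
    scores = []
    for j in range(n_features):
        feature_col = [row[j] for row in X]
        per_label = [chi2_binary(feature_col, col) for col in label_cols]
        scores.append(aggregate(per_label))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (hypothetical): feature 0 matches label 0 exactly,
# feature 1 is independent of both labels.
X = [[1, 1], [1, 0], [0, 0], [0, 1]]
Y = [[1, 1], [1, 0], [0, 1], [0, 0]]

order_max, scores_max = rank_features(X, Y, aggregate=max)
order_avg, scores_avg = rank_features(X, Y, aggregate=mean)
order_min, scores_min = rank_features(X, Y, aggregate=min)
```

Keeping the first k indices of the returned order gives the selected feature subset; how k is chosen is not stated in the abstract. On this toy data the minimum statistic scores both features at zero, which hints at why the three aggregations can rank features differently.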
Source
CAAI Transactions on Intelligent Systems (《智能系统学报》)
CSCD
Peking University Core Journal (北大核心)
2014, No. 3, pp. 292-297 (6 pages)
Funding
National Natural Science Foundation of China (61273305)
Keywords
feature selection
multi-label
filter
chi-square test