Abstract
Feature selection plays an important role in the classification accuracy and generalization performance of classifiers. Existing multi-label feature selection algorithms mainly apply the maximum relevance and minimum redundancy criterion over the entire feature set and do not consider expert features, so they suffer from long running times and high complexity. In practice, experts can often determine the overall direction of a prediction directly from a few key features. Extracting and exploiting this information will reduce the computation time of feature selection and may even improve classifier performance. On this basis, a multi-label feature selection algorithm based on expert features and conditional mutual information is proposed. First, the expert features are combined with the remaining features; then conditional mutual information is used to obtain a feature sequence ordered from strong to weak relevance to the label set; finally, subspaces are partitioned to remove highly redundant features. The algorithm was compared experimentally with other feature selection algorithms on 7 multi-label datasets. The results show that it has certain advantages over them, and statistical hypothesis testing and stability analysis further demonstrate its effectiveness and rationality.
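The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete features, treats the expert features as a single joint conditioning variable, and scores each remaining feature by summing its conditional mutual information with each label given the expert features. The summation over labels, the function names `conditional_mutual_information` and `rank_by_cmi`, and the toy data are all assumptions made for illustration. The subspace partitioning step is not sketched, since the abstract does not give its details.

```python
import math
from collections import Counter


def conditional_mutual_information(x, y, z):
    """Estimate I(X; Y | Z) in nats from equal-length sequences of discrete values."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), c in n_xyz.items():
        # p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) ); the 1/n factors cancel
        cmi += (c / n) * math.log(c * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)]))
    return cmi


def rank_by_cmi(features, labels, expert):
    """Rank non-expert features by summed CMI with the labels, given the expert features.

    features: dict mapping feature name -> list of discrete values
    labels:   list of label columns (one list per label)
    expert:   list of feature names chosen as expert features (assumed given)
    """
    # Combine the expert features into one joint conditioning variable.
    z = list(zip(*(features[name] for name in expert)))
    scores = {}
    for name, column in features.items():
        if name in expert:
            continue
        # Assumption: aggregate over labels by a plain sum.
        scores[name] = sum(
            conditional_mutual_information(column, label, z) for label in labels
        )
    order = sorted(scores, key=scores.get, reverse=True)
    return order, scores


# Hypothetical toy data: "e" is the expert feature.
features = {
    "e": [0, 0, 0, 0, 1, 1, 1, 1],
    "a": [0, 0, 1, 1, 0, 0, 1, 1],  # informative about both labels once e is known
    "b": [0, 0, 0, 0, 1, 1, 1, 1],  # copy of e, adds nothing given e
    "c": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the labels given e
}
labels = [
    [0, 0, 1, 1, 0, 0, 1, 1],  # label 1 equals a
    [0, 0, 1, 1, 1, 1, 0, 0],  # label 2 equals a XOR e
]

order, scores = rank_by_cmi(features, labels, expert=["e"])
print(order)
print(scores)
```

In this toy case, feature `a` has no marginal dependence on the second label but determines it once `e` is known, which is the kind of relevance a conditional score can pick up and an unconditional one would miss.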
Authors
程玉胜
宋帆
王一宾
钱坤
CHENG Yusheng; SONG Fan; WANG Yibin; QIAN Kun (School of Computer and Information, Anqing Normal University, Anqing, Anhui 246011, China; University Key Laboratory of Intelligent Perception and Computing of Anhui Province, Anqing, Anhui 246011, China)
Source
《计算机应用》
CSCD
PKU Core Journal
2020, Issue 2, pp. 503-509 (7 pages)
Journal of Computer Applications
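Funding
Key Scientific Research Project of Universities in Anhui Province (KJ2017A352)
Scientific Research Innovation Team Building Program of Anqing Normal University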
Keywords
feature selection
expert feature
conditional mutual information
multi-label learning
local subspace