Weak Label Feature Selection Method Based on AP Clustering and Mutual Information (cited by: 7)
Abstract: Feature selection is an important preprocessing step in multi-label learning. Existing multi-label classification methods do not consider how the proportion of each label affects the correlation between features and labels, and they cannot effectively handle weak label data. To address these issues, a weak label feature selection method based on affinity propagation (AP) clustering and mutual information is proposed. First, building on AP clustering, the remaining label information is combined with sample similarity to construct a probability filling formula that predicts the values of missing labels, so that the missing labels are filled in effectively. Second, prior probability is used to define the label proportion, which is combined with mutual information to construct a correlation metric that evaluates the degree of correlation between features and the label set. Finally, a weak label feature selection algorithm is designed to improve classification performance on weak label data. Simulation experiments on six multi-label datasets show that the algorithm achieves good classification performance on multiple metrics and outperforms several current related multi-label feature selection algorithms, which verifies its effectiveness.
Authors: Sun Lin (孙林); Shi Enhui (施恩惠); Si Shanshan (司珊珊); Xu Jiucheng (徐久成) (College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China)
Source: Journal of Nanjing Normal University (Natural Science Edition) (《南京师大学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2022, No. 3, pp. 108-115 (8 pages)
Funding: National Natural Science Foundation of China (62076089, 61772176, 61976082); Science and Technology Research Project of Henan Province (212102210136).
Keywords: multi-label learning; feature selection; AP clustering; mutual information; missing labels
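
The second step of the abstract (weighting mutual information by a label proportion derived from prior probabilities) can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the weight is taken to be each label's share of all positive label entries, the relevance of a feature is the weighted sum of its mutual information with each label, features are assumed to be discrete, and the function names `label_proportions`, `weighted_relevance`, and `rank_features` are hypothetical. The AP-clustering-based label filling step is not shown and the label matrix is assumed to be complete already.

```python
import numpy as np


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two discrete 1-D arrays."""
    x = np.asarray(x)
    y = np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            py = np.mean(y == yv)
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:  # skip empty cells, since 0 * log(0) is taken as 0
                mi += pxy * np.log2(pxy / (px * py))
    return mi


def label_proportions(Y):
    """Assumed weight: each label's share of all positive entries in Y.

    This is a stand-in for the paper's prior-probability-based label
    proportion, whose exact definition is not given in the abstract.
    """
    counts = np.asarray(Y).sum(axis=0).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else np.zeros_like(counts)


def weighted_relevance(X, Y):
    """Assumed relevance score: proportion-weighted sum of MI(feature, label)."""
    X = np.asarray(X)
    Y = np.asarray(Y)
    w = label_proportions(Y)
    return np.array([
        sum(w[j] * mutual_information(X[:, i], Y[:, j]) for j in range(Y.shape[1]))
        for i in range(X.shape[1])
    ])


def rank_features(X, Y):
    """Feature indices sorted from most to least relevant."""
    return np.argsort(-weighted_relevance(X, Y))


# Toy data: feature 0 copies label 0, feature 1 is constant (uninformative).
Y_demo = np.array([[1, 0], [1, 1], [0, 0], [0, 1]])
X_demo = np.array([[1, 0], [1, 0], [0, 0], [0, 0]])
scores = weighted_relevance(X_demo, Y_demo)
ranking = rank_features(X_demo, Y_demo)
```

On the toy data the informative feature receives a positive score and the constant feature receives zero, so the ranking places feature 0 first. Continuous features would need to be discretized, or a different mutual information estimator used, before this sketch applies.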