
A Multi-Label Classification Algorithm Based on ReliefF Pruning (cited by: 9)

ReliefF Based Pruning Model for Multi-Label Classification
Abstract: Multi-label classification (MLC) is a machine learning problem in which a model assigns a subset of labels to each instance. MLC is receiving increasing attention and is relevant to many domains, such as text categorization, classification of music and videos, and semantic annotation of images. Many recent studies seek efficient and accurate algorithms for MLC; these are usually partitioned into two main categories: algorithm adaptation and problem transformation. In MLC, labels do not occur independently of each other; instead, there are statistical dependencies between them. It is now commonly accepted that exploiting these dependencies is the key to improving multi-label classification performance.

In this paper, we divide the methods for exploiting label dependency into two groups, according to the way the problem is transformed: label grouping models and feature space extending models. Label grouping models group labels into several subsets based on certain strategies or criteria in order to incorporate label dependencies. Feature space extending models extend the feature space of the binary classifiers so that the classifiers can discover existing label dependencies by themselves. We find that the common difficulty for both kinds of models is how to measure the dependencies between labels accurately. In particular, for feature space extending models, choosing the proper set of labels with which to extend the original feature space is the key to improving classification performance.

On this basis, we propose a ReliefF based pruning model for multi-label classification (ReliefF based Stacking, RFS). RFS measures the dependencies between labels from a feature selection perspective and then adds the more strongly dependent labels to the original feature space. A stacking-based algorithm is used during training and prediction. The contribution of this algorithm is threefold: (1) It provides a new method for measuring the dependencies between labels. Unlike existing methods that measure pairwise label dependencies, our ReliefF-based method takes into account the effect of all interacting labels. (2) Instead of extending the original feature space with all labels, we choose only the closely related labels. This reduces noise in the data and avoids the adverse effects caused by irrelevant labels. (3) In the feature selection phase, we design a new strategy that treats original features and label features as the same kind of feature and selects among them together.

Our empirical study is divided into two parts: a systematic study of the parameters of our algorithm, and a comparative study between our proposal and other multi-label classification algorithms. The first part discusses the effects of parameters, feature selection strategies, and base classifiers on RFS. In the second part, experimental results on nine multi-label benchmark datasets, based on six evaluation measures, show that RFS has a fairly clear advantage over currently popular multi-label classification algorithms.
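The label selection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses a simplified binary variant of ReliefF (Manhattan distance, every instance used as a probe, features assumed to be scaled to [0, 1]), and the function names `relieff_scores` and `select_label_features` and the parameters `k` and `top_m` are hypothetical. The stacking stage and base classifiers are omitted.

```python
from typing import List, Sequence


def relieff_scores(rows: Sequence[Sequence[float]],
                   target: Sequence[int], k: int = 3) -> List[float]:
    """Simplified binary ReliefF (an assumption, not the paper's exact variant).

    A feature gains weight when it differs on nearest misses (other class)
    and loses weight when it differs on nearest hits (same class).
    """
    n, d = len(rows), len(rows[0])
    weights = [0.0] * d
    for i in range(n):
        # Manhattan distance from instance i to every other instance.
        dist = sorted(
            (sum(abs(a - b) for a, b in zip(rows[i], rows[j])), j)
            for j in range(n) if j != i
        )
        hits = [j for _, j in dist if target[j] == target[i]][:k]
        misses = [j for _, j in dist if target[j] != target[i]][:k]
        for f in range(d):
            for j in hits:
                weights[f] -= abs(rows[i][f] - rows[j][f]) / (n * len(hits))
            for j in misses:
                weights[f] += abs(rows[i][f] - rows[j][f]) / (n * len(misses))
    return weights


def select_label_features(X: Sequence[Sequence[float]],
                          Y: Sequence[Sequence[int]],
                          target_label: int, top_m: int = 1,
                          k: int = 3) -> List[int]:
    """Return the indices of the top_m other labels most relevant to target_label.

    The other labels are appended to the original features, and all columns
    are scored together in the extended space.
    """
    others = [j for j in range(len(Y[0])) if j != target_label]
    extended = [list(x) + [y[j] for j in others] for x, y in zip(X, Y)]
    scores = relieff_scores(extended, [y[target_label] for y in Y], k)
    label_scores = scores[len(X[0]):]  # scores of the appended label columns
    ranked = sorted(zip(label_scores, others), reverse=True)
    return [j for _, j in ranked[:top_m]]


# Toy data: label 1 always equals label 0; label 2 is unrelated to label 0.
X = [[0.1], [0.9], [0.2], [0.8], [0.3], [0.7], [0.4], [0.6]]
Y = [[1, 1, 0], [1, 1, 1], [1, 1, 0], [1, 1, 1],
     [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 1]]

selected = select_label_features(X, Y, target_label=0, top_m=1)
print(selected)
```

On this toy data, label 1 is selected as the extra feature for label 0, and the unrelated label 2 is pruned. In a stacking setup, the selected label columns would typically be filled with first-level predictions at prediction time; the abstract does not give those details, so that part is left out here.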
Authors: LIU Hai-Yang (刘海洋); WANG Zhi-Hai (王志海); ZHANG Zhi-Dong (张志东) (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044)
Source: Chinese Journal of Computers (《计算机学报》); indexed in EI, CSCD, and the Peking University Core Journals list; 2019, No. 3, pp. 483-496 (14 pages)
Funding: Supported by the National Natural Science Foundation of China (61672086, 61702030, 61771058) and the Beijing Natural Science Foundation (4182052).
Keywords: multi-label classification; label dependence; feature selection; ReliefF; Stacking