Abstract
In multi-label classification, the goal of a classifier is to predict the set of labels associated with an instance. A classical approach transforms the multi-label problem into several binary classification problems, and the resulting binary classifiers may be related to one another. Taking label dependency into account, even in a simple way, can improve classification performance to a certain extent, but computational complexity must also be considered. This paper proposes an ordered ensemble of classifiers algorithm that exploits the dependencies among labels: a heuristic search strategy selects an order for the classifiers that better reflects the label dependency. The experiments use multi-label datasets from different domains and several evaluation metrics, and the results show that the proposed method outperforms some state-of-the-art methods.
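The abstract does not spell out how the ordered binary classifiers are linked, so the following is only a minimal sketch under one common assumption: a classifier-chain-style construction in which each binary task sees the original features plus the labels that come earlier in the order. The function name `chain_training_sets` and the toy data are hypothetical, and the paper's heuristic search for the order is not shown; the order is simply passed in.

```python
from typing import List, Sequence, Tuple


def chain_training_sets(
    X: Sequence[Sequence[float]],
    Y: Sequence[Sequence[int]],
    order: Sequence[int],
) -> List[Tuple[int, List[List[float]], List[int]]]:
    """Build one binary training set per label, following `order`.

    Assumption (not taken from the paper): the task for the k-th label in
    `order` uses the original features extended with the true values of
    the labels that appear earlier in `order`.
    """
    tasks = []
    for k, label in enumerate(order):
        earlier = order[:k]  # labels already handled by preceding classifiers
        features = [
            list(x) + [float(y[j]) for j in earlier] for x, y in zip(X, Y)
        ]
        targets = [y[label] for y in Y]  # binary target for this label only
        tasks.append((label, features, targets))
    return tasks


# Toy data: 3 instances, 2 features, 3 labels (hypothetical values).
X = [[0.1, 1.0], [0.7, 0.2], [0.4, 0.4]]
Y = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]
tasks = chain_training_sets(X, Y, order=[2, 0, 1])
```

Under this assumption, changing `order` changes which label values each binary task can condition on, which is why the choice of order matters for capturing label dependency.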
Source
Journal of Chinese Information Processing (《中文信息学报》)
CSCD
Peking University Core Journal
2013, Issue 4, pp. 119-126 (8 pages)
Funding
Fundamental Research Funds for the Central Universities (2011YJS223)
Keywords
multi-label classification
text categorization
data mining