Abstract
Multi-label learning with label-specific features clusters the positive and negative instances of each label, then constructs label-specific features for every instance by computing its distance to the cluster centers. New training sets are generated from these label-specific features, and a traditional binary learner is trained on each of them. However, the feature sets are constructed with equal weight for every instance, which ignores the relevance among instances. This paper proposes a modified multi-label classification algorithm that weights instances so that the generated feature sets are more accurate, which helps improve classification accuracy. Experimental results show that the modified algorithm outperforms other commonly used multi-label classification algorithms.
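The baseline feature construction described in the abstract can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `kmeans` and `label_specific_features`, the parameter `k`, the use of plain Lloyd's k-means, and the Euclidean distance are all choices made here for illustration. The abstract does not specify the weighting scheme of the modified algorithm, so the sketch covers only the equal-weight baseline.

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns k cluster centers."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct data points.
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Distance of every point to every center, shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = points[assign == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def label_specific_features(X, y, k=2, seed=0):
    """Map X to distances from the cluster centers of one label's
    positive (y == 1) and negative (y == 0) instances.

    Assumes the label has at least one positive and one negative instance.
    """
    pos, neg = X[y == 1], X[y == 0]
    k = min(k, len(pos), len(neg))  # cannot have more clusters than instances
    centers = np.vstack([kmeans(pos, k, seed=seed), kmeans(neg, k, seed=seed)])
    # One new feature per center: the instance's distance to that center.
    return np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
```

Calling `label_specific_features(X, y)` once per label yields one mapped training set per label, on which any binary classifier could then be trained.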
Source
《计算机工程与应用》
CSCD
2013, No. 22, pp. 163-166 (4 pages)
Computer Engineering and Applications
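Funding
National Natural Science Foundation of China (No. 61170145)
Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (No. 20113704110001)
Shandong Provincial Natural Science Foundation and Science and Technology Research Program (No. ZR2010FM021, No. 2008B0026, No. 2010G0020115)
Support from the Shandong Provincial Key Laboratory of Distributed New Technology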
Keywords
classification
clustering center
weighting
multi-label learning