Abstract
Multi-label learning addresses the case in which a single instance is associated with multiple class labels, while class-imbalance learning studies how unevenly distributed samples affect learning algorithms. Both are active topics in machine learning research. Class imbalance is common in multi-label datasets, yet although many multi-label learning algorithms have been proposed, little research has examined the intrinsic characteristics of the datasets themselves. To address this problem, we present a particle swarm optimization (PSO)-based multi-label threshold-adaptation extreme learning machine (MLTA-ELM). The algorithm combines the advantages of the extreme learning machine, namely fast learning and good generalization, with the adaptive threshold selection strategy used in class-imbalance learning. First, an extreme learning machine is used to build a single-hidden-layer feedforward neural network, and this model produces preliminary multi-label predictions. Then, particle swarm optimization serves as the adaptive threshold selection strategy to obtain the optimal combination of thresholds for deciding label membership. Finally, experiments on 12 benchmark multi-label datasets verify the feasibility and effectiveness of MLTA-ELM. The results indicate that the proposed algorithm achieves better predictive performance than several popular methods.
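The pipeline described in the abstract (ELM scoring followed by a PSO search over per-label thresholds) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the fitness function, the PSO parameters, or the hidden-layer settings, so the macro-averaged F1 fitness, the sigmoid activation, the ridge term `reg`, and every function name below are assumptions made for the example.

```python
import numpy as np

def train_elm(X, Y, n_hidden=50, reg=1e-2, rng=None):
    """Fit an ELM: random hidden layer, closed-form (ridge) output weights."""
    rng = np.random.default_rng(rng)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, never trained
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))        # sigmoid hidden activations
    # Output weights: beta = (H^T H + reg * I)^-1 H^T Y
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ Y)
    return W, b, beta

def elm_scores(model, X):
    """Real-valued score per (sample, label); thresholds are applied later."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

def macro_f1(scores, Y, thresholds):
    """Assumed fitness: F1 averaged over labels, one threshold per label."""
    pred = scores >= thresholds                   # broadcasts (n, L) against (L,)
    tp = np.sum(pred & (Y == 1), axis=0)
    fp = np.sum(pred & (Y == 0), axis=0)
    fn = np.sum(~pred & (Y == 1), axis=0)
    denom = 2 * tp + fp + fn
    f1 = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)
    return float(f1.mean())

def pso_thresholds(scores, Y, n_particles=20, n_iter=50,
                   w=0.7, c1=1.5, c2=1.5, rng=None):
    """Search for a threshold vector (one entry per label) with a basic PSO."""
    rng = np.random.default_rng(rng)
    n_labels = Y.shape[1]
    lo, hi = scores.min(), scores.max()
    pos = rng.uniform(lo, hi, size=(n_particles, n_labels))
    pos[0] = 0.5                                  # seed with the fixed default threshold
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([macro_f1(scores, Y, p) for p in pos])
    g = int(np.argmax(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Standard velocity update: inertia + pull to personal and global bests
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([macro_f1(scores, Y, p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = int(np.argmax(pbest_fit))
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit
```

Given a feature matrix `X` and a 0/1 label matrix `Y`, one would call `train_elm`, score the data with `elm_scores`, and pass the scores to `pso_thresholds`. Because one particle starts at the fixed threshold 0.5, the returned fitness on the tuning data is never worse than that default. In practice the thresholds would presumably be tuned on a held-out validation split rather than on the training scores.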
Authors
XU Er-qiang (许二戗)
YU Hua-long (于化龙)
School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212003, China
Source
Computer Technology and Development (《计算机技术与发展》)
2019, No. 4, pp. 47-52 (6 pages)
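Funding
National Natural Science Foundation of China (61305058, 61572242)
China Postdoctoral Science Foundation Special Funding Program (2015T80481)
China Postdoctoral Science Foundation (2013M540404)
Natural Science Foundation of Jiangsu Province (BK20130471)
Jiangsu Province Postdoctoral Fund (1401037B)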
Keywords
multi-label classification
class imbalance
particle swarm optimization
extreme learning machine
threshold technique