Abstract
Label distribution learning (LDL) is a recent machine learning paradigm that handles certain kinds of label ambiguity well. It can be viewed as a generalization of multi-label learning, with traditional single-label and multi-label learning as special cases. Among existing LDL algorithms, AA-KNN (Algorithm Adaptation-KNN), which is based on algorithm adaptation, is an efficient method. However, any algorithm that relies on K-nearest-neighbor search faces the difficulty of choosing the parameter K for different data sets, and different values of K give markedly different results. To address this, the concept of natural nearest neighbors is introduced into LDL and a new label distribution learning method is proposed. A natural nearest neighbor search algorithm finds the natural neighbors of each sample in the data set, and the mean of the natural neighbors' label distributions is taken as the prediction. The search algorithm requires no manually set parameters; it is a passive search that adaptively determines the neighbors of each sample. Experiments on 6 data sets with 6 evaluation metrics show that, compared with AA-KNN, the proposed algorithm not only avoids manual parameter setting but also achieves better results.
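The prediction rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the search procedure, so the code below assumes one common formulation of natural neighbor search: the neighborhood size r grows step by step until the number of samples that no other sample has selected stops changing (or reaches zero), and two samples count as natural neighbors when each lies among the other's r nearest neighbors. The function names, the handling of a query by appending it to the training set, and the fallback to the single nearest sample are all assumptions made for illustration, not the authors' implementation. A fixed-K mean (the AA-KNN style baseline) is included for contrast.

```python
import numpy as np


def natural_neighbors(X):
    """Return (neighbor index arrays, final r) under the assumed formulation."""
    n = X.shape[0]
    # Pairwise Euclidean distances; a sample is never its own neighbor.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)
    order = np.argsort(dist, axis=1, kind="stable")

    # chosen[i, j] is True when j is among the r nearest neighbors of i.
    chosen = np.zeros((n, n), dtype=bool)
    prev_orphans = -1
    r = 0
    for r in range(1, n):
        chosen[np.arange(n), order[:, r - 1]] = True
        # Orphans: samples that no other sample has selected so far.
        orphans = int((~chosen.any(axis=0)).sum())
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans

    mutual = chosen & chosen.T  # keep only mutual selections
    return [np.flatnonzero(mutual[i]) for i in range(n)], r


def predict_natural(X_train, D_train, x_query):
    """Mean label distribution of the query's natural neighbors (no K)."""
    X_all = np.vstack([X_train, x_query[None, :]])
    neighbors, _ = natural_neighbors(X_all)
    idx = neighbors[-1]  # the query is the last row
    if idx.size == 0:
        # Assumption: fall back to the single nearest training sample.
        idx = np.array([np.argmin(np.linalg.norm(X_train - x_query, axis=1))])
    return D_train[idx].mean(axis=0)


def predict_fixed_k(X_train, D_train, x_query, k):
    """Baseline for contrast: mean label distribution of the k nearest samples."""
    d = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(d, kind="stable")[:k]
    return D_train[idx].mean(axis=0)
```

Here `X_train` is an array of feature vectors and `D_train` holds one label distribution per row (non-negative entries summing to 1). Because the prediction is a mean of valid distributions, it is itself a valid distribution. The fixed-K baseline needs `k` chosen by hand, which is the difficulty the abstract points to.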
Authors
姚成亮
朱庆生
YAO Cheng-liang; ZHU Qing-sheng (Chongqing Key Lab of Software Theory and Technology, College of Computer Science, Chongqing University, Chongqing 400044, China)
Source
《计算机科学》
CSCD
Peking University Core Journal
2020, No. 8, pp. 132-136 (5 pages)
Computer Science
Funding
National Natural Science Foundation of China (61802360)
Chongqing Science and Technology Project (KJZH7104)
Keywords
Label distribution
Label distribution learning
Natural neighbors
Parameter-free