Active Metric Learning Based on Support Vector Machines (cited by: 2)
Abstract Metric learning is an important research topic in machine learning, and the quality of the learned metric directly affects the performance of downstream learning algorithms. Most current work on metric learning assumes a supervised setting. In real-world applications, however, large amounts of data are unlabeled, or labels can only be obtained at high cost. To address this problem, this paper proposes an active metric learning algorithm based on support vector machines (ASVM²L) that is suitable for semi-supervised settings. First, a small number of samples are randomly selected from the unlabeled data and labeled by an expert (oracle), and these samples are used to train the support vector machine metric learner (SVM²L). Then, based on the learned metric, the remaining unlabeled samples are classified by K-nearest-neighbor (K-NN) classifiers with different values of K; the sample on which the classifiers' votes disagree most is submitted to the oracle for labeling and added to the training set, and the metric is learned again. These steps are repeated until a termination condition is met, so that the best metric matrix can be obtained from a limited subset of labeled samples. Comparative experiments on standard datasets verify that the proposed ASVM²L algorithm obtains more label information from the fewest labeled samples without affecting classification accuracy, and therefore achieves better metric performance.
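The loop described in the abstract (random seed set, metric learning, K-NN committee with different K, query of the most disputed sample, retraining) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the SVM metric learner is not reproduced here and is replaced by a hypothetical placeholder `learn_metric` that returns a diagonal metric from inverse within-class variance; vote entropy is assumed as the measure of "voting difference"; and a fixed label budget is assumed as the termination condition, which the abstract does not specify. All function and parameter names are illustrative.

```python
import math
import random
from collections import Counter


def learn_metric(X, y):
    """Placeholder for the SVM metric learner (NOT the paper's method).

    Returns one positive weight per feature (a diagonal metric), computed
    as the inverse of the pooled within-class variance of that feature.
    """
    weights = []
    for j in range(len(X[0])):
        scatter = 0.0
        for label in set(y):
            vals = [x[j] for x, t in zip(X, y) if t == label]
            mean = sum(vals) / len(vals)
            scatter += sum((v - mean) ** 2 for v in vals)
        weights.append(1.0 / (scatter / len(X) + 1e-9))
    return weights


def distance(w, a, b):
    """Weighted Euclidean distance under the diagonal metric w."""
    return math.sqrt(sum(wj * (aj - bj) ** 2 for wj, aj, bj in zip(w, a, b)))


def knn_predict(w, X, y, query, k):
    """Majority label among the k labeled points nearest to query."""
    nearest = sorted(range(len(X)), key=lambda i: distance(w, X[i], query))[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]


def vote_entropy(votes):
    """Disagreement of a committee: 0 when unanimous, larger when split."""
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in Counter(votes).values())


def active_metric_learning(pool, oracle, n_init=4, budget=10, ks=(1, 3, 5), seed=0):
    """Query labels one at a time until `budget` samples are labeled.

    pool   : list of feature tuples (all initially unlabeled)
    oracle : callable mapping a pool index to its label (the expert)
    Returns the final metric weights and the indices that were labeled.
    """
    rng = random.Random(seed)
    unlabeled = list(range(len(pool)))
    rng.shuffle(unlabeled)
    # Step 1: label a small random seed set.
    labeled = [unlabeled.pop() for _ in range(n_init)]
    labels = {i: oracle(i) for i in labeled}
    while True:
        X = [pool[i] for i in labeled]
        y = [labels[i] for i in labeled]
        # Step 2: (re)learn the metric from the current labeled set.
        w = learn_metric(X, y)
        if len(labeled) >= budget or not unlabeled:
            return w, labeled
        # Step 3: K-NN classifiers with different K vote on each unlabeled
        # sample; the sample with the most divided vote is queried next.
        pick = max(
            unlabeled,
            key=lambda i: vote_entropy([knn_predict(w, X, y, pool[i], k) for k in ks]),
        )
        unlabeled.remove(pick)
        labeled.append(pick)
        labels[pick] = oracle(pick)
```

Calling `active_metric_learning(pool, oracle)` with a list of feature tuples and a labeling function returns the learned weights and the queried indices. Swapping `learn_metric` for an actual SVM-based metric learner would leave the query loop unchanged, which is the point of the sketch.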
Authors HOU Xia-ye; CHEN Hai-yan; ZHANG Bing; YUAN Li-gang; JIA Yi-zhen (College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China)
Source Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2022, Issue S01, pp. 113-118 (6 pages)
Funding National Natural Science Foundation of China (61501229); Fundamental Research Funds for the Central Universities (NS2019054, NS2020045).
Keywords Metric learning; Support vector machine metric learning; Semi-supervised learning; Active learning; Sampling strategy