Hard-negative sample mining for metric learning based on linear assignment
Abstract Scientists identify the species of whales from the shape of their tails and the distinctive markings on them, but recognition by eye and manual labeling are very tedious. Whale tail photo datasets also have an unbalanced data distribution: some categories have very few samples, or even a single one. In addition, the differences between individual samples are small and the data contain unknown categories, which makes it difficult to label whale identities automatically by image classification. To address the difficulty that metric learning has with classification in this task, the proposed method builds on a Siamese Neural Network (SNN) and uses a Linear Assignment Problem (LAP) algorithm for hard-negative sample mining during training, so that training batches are constructed dynamically. First, image feature vectors are extracted from the training samples and a similarity metric between the feature vectors is computed. Then, LAP is used to assign sample pairs to the model: training batches are constructed dynamically from the metric score matrix, so that training targets the hard sample pairs. Experimental results on a whale tail image dataset with an unbalanced data distribution and on the CUB-200-2011 dataset show that the proposed algorithm achieves good results in minority-class learning and fine-grained image classification.
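The pair-assignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: cosine similarity stands in for the SNN's learned similarity score, same-class pairs are excluded by a large constant cost (`forbidden_cost`), and the assignment is one-to-one, so each sample serves as a negative exactly once. The function names `solve_lap`, `cosine_similarity`, and `mine_hard_negatives` are hypothetical. The LAP solver is a plain-Python Hungarian algorithm, included so the block has no dependencies.

```python
import math


def solve_lap(cost):
    """Hungarian algorithm for a square cost matrix (list of lists).

    Returns a list `assignment` where assignment[row] = column,
    minimizing the total cost of a one-to-one assignment.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-based)
    v = [0.0] * (n + 1)   # column potentials (1-based)
    p = [0] * (n + 1)     # p[j] = row currently matched to column j
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))


def mine_hard_negatives(embeddings, labels, forbidden_cost=1e6):
    """Pair each sample with a similar sample of a different class.

    Cost is the negated similarity for different-class pairs, so the
    minimum-cost assignment favors the most confusable negatives.
    Same-class pairs (including self-pairs) get `forbidden_cost`.
    """
    n = len(embeddings)
    cost = [[forbidden_cost] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if labels[i] != labels[j]:
                cost[i][j] = -cosine_similarity(embeddings[i], embeddings[j])
    assignment = solve_lap(cost)
    # Drop any pair that was forced onto a forbidden (same-class) cell.
    return [(i, j) for i, j in enumerate(assignment) if labels[i] != labels[j]]


# Toy usage: four 2-D feature vectors from two classes.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.6], [0.0, 1.0]]
labels = ["a", "a", "b", "b"]
pairs = mine_hard_negatives(embeddings, labels)
print(pairs)
```

In a training loop, the score matrix would be recomputed from the current model at intervals and the resulting negative pairs mixed with positive pairs to form each batch. With SciPy installed, `scipy.optimize.linear_sum_assignment` could replace `solve_lap`.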
Authors FU Taiming (傅泰铭); CHEN Yan (陈燕); LI Taoshen (李陶深) (College of Computer, Electronics and Information, Guangxi University, Nanning, Guangxi 530004, China)
Source Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2020, No. 2, pp. 352-357 (6 pages)
Funding National Natural Science Foundation of China (61762008); Guangxi Key Research and Development Program (AB17195014)
Keywords linear assignment; hard-negative sample mining; metric learning; fine-grained image recognition; Siamese Neural Network (SNN)