Abstract
Self-supervised visual representation learning based on convolutional neural networks (CNNs) has recently been widely adopted because it can exploit unlabeled data. However, its slow convergence and limited ability to capture fine details lead to high computational cost and poor performance. To address these issues, an effective and efficient self-supervised representation learning method based on a Lego sampling strategy is proposed. First, the Lego-block-style sampling strategy samples small patches from the original image to increase the number of samples, and uses images stitched from these small patches to keep the computational cost low. Second, an information retainer projection head (IRPH) is introduced as a local-detail contrast branch to balance local detail features against global semantic features, that is, detail inconsistency against semantic consistency. Finally, multiple loss functions are used to jointly optimize the model. The effectiveness of the method is verified on CIFAR and three fine-grained classification datasets. Experimental results show that, compared with MoCo and other methods, the proposed method captures more comprehensive global and detail information, achieves higher linear classification accuracy and detection average precision (AP) on downstream tasks such as linear classification and object detection, and yields better visual results.
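The Lego sampling step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the patch size, the number of patches, or how they are arranged, so the function name `lego_sample`, its parameters, the uniform random cropping, and the square grid layout below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def lego_sample(image, patch_size, grid, rng):
    """Crop grid*grid random patches and tile them into one stitched image.

    Hypothetical helper: patch size, grid layout, and uniform random
    cropping are illustrative assumptions, not details from the paper.
    """
    h, w = image.shape[:2]
    rows = []
    for _ in range(grid):
        row = []
        for _ in range(grid):
            # Pick a top-left corner so the patch stays inside the image.
            y = rng.integers(0, h - patch_size + 1)
            x = rng.integers(0, w - patch_size + 1)
            row.append(image[y:y + patch_size, x:x + patch_size])
        # Join the patches of one grid row side by side.
        rows.append(np.concatenate(row, axis=1))
    # Stack the grid rows vertically to form the stitched image.
    return np.concatenate(rows, axis=0)

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))  # dummy CIFAR-sized image
stitched = lego_sample(image, patch_size=16, grid=2, rng=rng)
print(stitched.shape)
```

With a 2 x 2 grid of 16 x 16 patches, the stitched image has the same 32 x 32 resolution as the input, so the per-sample cost of a forward pass is unchanged even though each stitched image draws on several independently sampled patches.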
Authors
许亦博
赵文义
李灵巧
杨辉华
XU Yibo; ZHAO Wenyi; LI Lingqiao; YANG Huihua (School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China; College of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source
《桂林电子科技大学学报》
2023, No. 3, pp. 181-186 (6 pages)
Journal of Guilin University of Electronic Technology
Funding
National Natural Science Foundation of China (61906050).
Keywords
self-supervised learning
unlabeled data
Lego sampling
global features
local features