On Wine Label Image Data Augmentation Through Viewpoint Based Transformation
Abstract As living standards rise and wine culture develops, an efficient automated wine label image retrieval system becomes increasingly important. However, real-world wine label image datasets are typically class-imbalanced, and many classes contain only a few images, which makes it difficult to train deep-learning-based wine label retrieval models effectively. Data augmentation for wine label images is therefore both necessary and urgent. To address this problem, we propose a data augmentation algorithm designed specifically for wine label images. We treat the label as lying on the surface of a cylindrical bottle, so that a wine label image is the projection of the label, from a shooting viewpoint, onto a plane tangent to the cylinder. Under this model, the cylindrical label can be reconstructed from a single image, and new wine label images can be generated by moving the viewpoint up and down, left and right, and nearer or farther, then re-projecting the cylindrical label. Experimental results on a large-scale wine label image dataset show that this viewpoint-transformation-based augmentation strategy effectively increases the number of essentially different images of the same wine label and significantly improves the retrieval ability of the wine label retrieval model.
Authors LI Xiaoqing (李晓晴); ZHANG Xiaochang (张孝昌); CAI Zijia (才子嘉); MA Jinwen (马尽文) (Department of Information Science, School of Mathematical Sciences, Peking University, Beijing 100871, China; The Institute of Military Medical Research, Beijing 100850, China)
Source Journal of Signal Processing (《信号处理》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 1, pp. 43-54 (12 pages)
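Funding National Key R&D Program of China (Ministry of Science and Technology), Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project, topic "Interpretability of Neural Networks" (2018AAA0100205).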
Keywords wine label image; deep learning; data augmentation; viewpoint transformation; cylinder modeling
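
The geometry summarized in the abstract (a label on a cylinder, projected from a viewpoint onto a tangent plane, then re-projected from a moved viewpoint) can be illustrated with a minimal sketch. This is not the authors' implementation, whose details are not given in this record. The coordinate frame (cylinder axis along y, tangent plane at z = radius), the unit radius, and the function names `unproject_to_cylinder`, `project_from_cylinder`, and `transform_viewpoint` are all assumptions made for illustration. The sketch maps point coordinates only; warping a full image would additionally need pixel resampling.

```python
import numpy as np


def unproject_to_cylinder(u, v, viewpoint, radius=1.0):
    """Map tangent-plane points (u, v) to cylinder coordinates (theta, y).

    Assumed frame: cylinder x^2 + z^2 = radius^2 with its axis along y,
    image plane z = radius, viewpoint (vx, vy, vz) with vz > radius.
    """
    vx, vy, vz = viewpoint
    if vz <= radius:
        raise ValueError("viewpoint must lie in front of the tangent plane")
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    # Ray from the viewpoint through the plane point (u, v, radius).
    dx, dy, dz = u - vx, v - vy, radius - vz
    # Intersect the ray with the cylinder: a*t^2 + b*t + c = 0.
    a = dx**2 + dz**2
    b = 2.0 * (vx * dx + vz * dz)
    c = vx**2 + vz**2 - radius**2
    disc = b**2 - 4.0 * a * c
    if np.any(disc < 0):
        raise ValueError("some rays miss the cylinder")
    t = (-b - np.sqrt(disc)) / (2.0 * a)  # nearer of the two intersections
    px, py, pz = vx + t * dx, vy + t * dy, vz + t * dz
    return np.arctan2(px, pz), py


def project_from_cylinder(theta, y, viewpoint, radius=1.0):
    """Project cylinder points (theta, y) onto the plane z = radius."""
    vx, vy, vz = viewpoint
    theta = np.asarray(theta, dtype=float)
    y = np.asarray(y, dtype=float)
    px, pz = radius * np.sin(theta), radius * np.cos(theta)
    # Parameter at which the ray viewpoint -> surface point crosses the plane.
    t = (vz - radius) / (vz - pz)
    return vx + t * (px - vx), vy + t * (y - vy)


def transform_viewpoint(u, v, old_view, new_view, radius=1.0):
    """Re-project plane points seen from old_view as seen from new_view."""
    theta, y = unproject_to_cylinder(u, v, old_view, radius)
    return project_from_cylinder(theta, y, new_view, radius)


if __name__ == "__main__":
    # Hypothetical label corners on the tangent plane (units of bottle radius).
    corners_u = np.array([-0.6, 0.6, 0.6, -0.6])
    corners_v = np.array([-0.8, -0.8, 0.8, 0.8])
    # Move the viewpoint to the right, upward, and farther away.
    new_u, new_v = transform_viewpoint(
        corners_u, corners_v, old_view=(0.0, 0.0, 4.0), new_view=(0.8, 0.5, 6.0)
    )
    print(np.column_stack([new_u, new_v]))
```

The sketch does not check whether a surface point is actually visible from the new viewpoint; a practical augmentation pipeline would need to restrict viewpoint shifts so that the label stays on the visible side of the bottle.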