
基于多特征融合的多尺度服装图像精准化检索 (Cited by: 14)

Accurate Retrieval of Multi-scale Clothing Images Based on Multi-feature Fusion
Abstract: To fully exploit the multi-level scale features of clothing images from the global level down to local details, and to combine the respective strengths of deep learning in extracting deep semantic features and of traditional features in extracting low-level features, so that feature extraction focuses on the clothing itself and covers it comprehensively, an accurate multi-scale clothing image retrieval algorithm based on multi-feature fusion is proposed. First, to fuse different types of features effectively, a fusion formula FSF (Feature Similarity Fusion) based on feature similarity is designed. Second, a dataset annotated with the LabelImg tool is used to train a YOLOv3 model, which then simultaneously extracts three-level scale images: the global region, the main-body region, and the style-part regions. The global region keeps the whole clothing and human-body area with the background removed; the main-body region is the upper-body or lower-body clothing area to be retrieved, excluding non-retrieved clothing; the style-part regions are local areas such as the collar and sleeves. These three-level scale images weaken, to varying degrees, the influence of the background and other interference, focusing on the clothing itself. The three-level scale images are then fed into three convolutional neural networks (CNNs) for feature extraction. Each CNN is trained successively with clothing style attribute classification and with metric learning: the classification training improves the CNN's ability to extract style-attribute features, while the metric learning reduces the distance between features of the same category and enlarges the distance between features of different categories, further improving its ability to discriminate between different clothing images. The three CNN features are fused with the FSF formula; the resulting multi-scale CNN fusion features cover the clothing image from the global region to the main body and on to the style parts. Style attribute prediction is then added to optimize the Euclidean distance between features while suppressing semantic drift: when the predictions of the five style attribute types mostly agree, the Euclidean distance between the two clothing images is reduced proportionally; when they mostly disagree, the distance is enlarged proportionally. This yields the preliminary retrieval results. Finally, because low-level features complement the deep semantic features extracted by the CNNs, traditional features are introduced to constrain the texture and color of the preliminary results, and the FSF formula combines the multi-scale CNN fusion features with the traditional features to further optimize the ranking of the preliminary retrieval results. Experimental results show that the algorithm fully extracts multi-scale CNN features from the global region to the style-part regions and, combined with traditional features, effectively optimizes the ranking and improves retrieval accuracy. In the Top-20 experiment, accuracy improves by 16.4% compared with the FashionNet model.
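For a concrete feel for two of the steps described above, the short Python sketch below illustrates (a) a weighted fusion of similarity scores from the three CNN branches and (b) scaling a Euclidean distance by how many of the five style-attribute predictions agree. It is a minimal sketch under assumed weights and scaling strength; the paper's actual FSF formula and adjustment rule are defined in the full text, and the helper names here (fuse_similarities, attribute_adjusted_distance) are hypothetical.

    # Illustrative sketch only: weights, alpha, and helper names are assumptions,
    # not the paper's exact FSF formula or adjustment rule.
    import numpy as np

    def cosine_similarity(a, b):
        # Cosine similarity between two feature vectors.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def fuse_similarities(sims, weights):
        # Weighted fusion of per-branch similarity scores (global, main-body,
        # style-part CNN branches); stands in for the paper's FSF formula.
        sims = np.asarray(sims, dtype=float)
        weights = np.asarray(weights, dtype=float)
        return float(np.dot(weights, sims) / weights.sum())

    def attribute_adjusted_distance(dist, query_attrs, gallery_attrs, alpha=0.5):
        # Scale a Euclidean distance by how many of the 5 style-attribute
        # predictions agree: high agreement shrinks the distance, low
        # agreement enlarges it (alpha is an assumed scaling strength).
        agreement = sum(q == g for q, g in zip(query_attrs, gallery_attrs)) / len(query_attrs)
        return dist * (1.0 + alpha * (1.0 - 2.0 * agreement))

    # Toy usage with random 128-d branch features and made-up attribute labels.
    rng = np.random.default_rng(0)
    query_feats = [rng.random(128) for _ in range(3)]
    gallery_feats = [rng.random(128) for _ in range(3)]

    sims = [cosine_similarity(q, g) for q, g in zip(query_feats, gallery_feats)]
    fused_sim = fuse_similarities(sims, weights=[0.4, 0.4, 0.2])

    dist = float(np.linalg.norm(np.concatenate(query_feats) - np.concatenate(gallery_feats)))
    dist = attribute_adjusted_distance(
        dist,
        ["v-neck", "long-sleeve", "slim", "solid", "button"],
        ["v-neck", "long-sleeve", "loose", "solid", "zipper"],
    )
    print(fused_sim, dist)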
Authors: 王志伟 (WANG Zhi-Wei), 普园媛 (PU Yuan-Yuan), 王鑫 (WANG Xin), 赵征鹏 (ZHAO Zheng-Peng), 徐丹 (XU Dan), 钱文华 (QIAN Wen-Hua) — School of Information Science and Engineering, Yunnan University, Kunming 650504; Internet of Things Technology and Application Key Laboratory of Universities in Yunnan, Kunming 650504
Source: Chinese Journal of Computers (《计算机学报》; EI, CSCD, Peking University Core), 2020, Issue 4, pp. 740-754 (15 pages)
Funding: National Natural Science Foundation of China (61163019, 61271361, 61462093, 61761046, U1802271); Yunnan Provincial Science and Technology Department projects (2014FA021, 2018FB100); Yunnan Provincial Education Department scientific research projects (2019Y0004, 2018JS011, 2016CYH03)
Keywords: clothing image retrieval; multi-scale; multi-label learning; metric learning; feature similarity fusion
