Abstract
Although deep learning has achieved strong results in fashion retrieval in recent years, fashion style itself has received little attention. Consumers often search for clothing that matches a style they like, or prefer to retrieve items similar to their own dressing style. Existing work on fashion style only classifies styles and identifies a user's preferred style from an input image; such retrieval can only return clothing whose style resembles the query image, but cannot return items that actually pair with it. Starting from the overall compatibility of fashion style, this paper treats each clothing item as a word and, following the word-similarity idea in Word2vec, proposes a text-based style retrieval model and an image-based style retrieval model. The features obtained from the two modalities are then fused to build a multimodal style retrieval model. Experiments on the multimodal Polyvore dataset show that, under the style-similarity criteria used by previous researchers, the multimodal fusion method achieves a higher average similarity in its retrieved lists than single-modality style retrieval and other multimodal hybrid style retrieval methods.
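To make the item-as-word idea concrete, the sketch below (not the authors' implementation) trains a gensim Word2Vec model over outfits treated as "sentences" of item IDs, pairs each item with a stand-in image embedding, and retrieves stylistically similar items by cosine similarity of the fused vectors. The item IDs, the random image features, the concatenation-based fusion, and all hyperparameters are illustrative assumptions; the paper's actual text/image encoders and fusion strategy may differ.

```python
# Minimal sketch of Word2vec-style item embeddings with a simple multimodal fusion.
# Assumptions: hypothetical item IDs, random stand-in image features (a CNN would
# supply these in practice), and fusion by concatenation.
import numpy as np
from gensim.models import Word2Vec

# Hypothetical outfits from a Polyvore-like dataset: each list is one outfit,
# each string is an item ID playing the role of a "word".
outfits = [
    ["shirt_01", "jeans_07", "sneaker_03"],
    ["shirt_01", "skirt_02", "heels_05"],
    ["blazer_04", "jeans_07", "loafer_06"],
]

# Text modality: train Word2vec over the outfit "sentences".
text_model = Word2Vec(outfits, vector_size=32, window=5, min_count=1, sg=1, epochs=50)

# Image modality: stand-in embeddings (random here, CNN features in the paper).
rng = np.random.default_rng(0)
image_emb = {item: rng.normal(size=64) for outfit in outfits for item in outfit}

def fused_embedding(item):
    """Concatenate text (Word2vec) and image embeddings -- one simple fusion choice."""
    return np.concatenate([text_model.wv[item], image_emb[item]])

def retrieve(query, k=3):
    """Rank all other items by cosine similarity of fused embeddings to the query item."""
    q = fused_embedding(query)
    scores = []
    for item in text_model.wv.index_to_key:
        if item == query:
            continue
        v = fused_embedding(item)
        scores.append((item, float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))))
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]

print(retrieve("shirt_01"))
```

Here retrieval is a nearest-neighbour ranking over fused vectors; evaluating the average similarity of the returned list, as in the paper, would replace the toy similarity score with the style-similarity criteria of prior work.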
Source
《计算机科学与应用》 (Computer Science and Application)
2023, No. 3, pp. 492-501 (10 pages)