Funding: National Natural Science Foundation of China (No. 61971121).
Abstract: Clothing attribute recognition has become an essential technology that enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address these issues, a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model is proposed. The model aligns cropped and segmented images with category texts and multiple fine-grained attribute texts, matching fashion regions to their corresponding texts through contrastive learning. Clothing retrieval finds suitable clothing based on user-specified categories and attributes. To further improve retrieval accuracy, an attribute-guided composed network (AGCN) is introduced as an additional component on RaF-CLIP, designed specifically for composed image retrieval, a task that modifies the reference image according to a textual expression in order to retrieve the expected target. Using transformer-based bidirectional attention and a gating mechanism, the network fuses and selects image features and attribute text features. Experimental results show that the proposed model achieves a mean precision of 0.6633 on attribute recognition and a recall@10 of 39.18 on composed image retrieval (recall@k is defined as the percentage of correct samples appearing in the top k retrieval results), satisfying user needs for freely searching for clothing through images and texts.
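The recall@k metric quoted in this abstract can be illustrated with a minimal sketch. It assumes the common reading in which each query has exactly one correct target and the score is the percentage of queries whose target appears in the top k results; the abstract does not spell out this detail. The function name `recall_at_k` and the toy ids are hypothetical and are not taken from the paper.

```python
def recall_at_k(ranked_ids, target_ids, k):
    """Percentage of queries whose target appears among the top-k results.

    ranked_ids: one ranked list of candidate ids per query, best match first.
    target_ids: the single correct id for each query (an assumption here).
    """
    if len(ranked_ids) != len(target_ids):
        raise ValueError("need exactly one target per query")
    if not ranked_ids:
        raise ValueError("no queries given")
    # A query counts as a hit if its target is within the first k candidates.
    hits = sum(
        1 for ranked, target in zip(ranked_ids, target_ids)
        if target in ranked[:k]
    )
    return 100.0 * hits / len(ranked_ids)


# Toy data with hypothetical ids, purely for illustration.
rankings = [
    ["a", "b", "c"],
    ["d", "e", "f"],
    ["g", "h", "i"],
    ["j", "k", "l"],
]
targets = ["b", "f", "x", "j"]

print(recall_at_k(rankings, targets, k=2))  # 2 of 4 queries hit -> 50.0
print(recall_at_k(rankings, targets, k=3))  # 3 of 4 queries hit -> 75.0
```

Under this reading, a recall@10 of 39.18 would mean that the expected target appears among the first ten retrieved images for roughly 39% of queries.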
Funding: Supported by the Fundamental Research Funds for the Central Universities (Nos. NS2019054, NS2020045).
Abstract: To improve the recognition accuracy of similar weather scenarios (SWSs) in the terminal area, a recognition model for SWSs based on contrastive learning (SWS-CL) is proposed. First, a data augmentation method is designed to increase the number and improve the quality of weather scenario samples according to the characteristics of convective weather images. Second, in pre-training the SWS-CL recognition model, a loss function is formulated to minimize the distance between the anchor and positive samples and to maximize the distance between the anchor and negative samples in the latent space. Finally, the pre-trained SWS-CL model is fine-tuned with labeled samples to improve the recognition accuracy of SWSs. Comparative experiments on weather images of the Guangzhou terminal area show that the proposed data augmentation method effectively improves the quality of the weather image dataset and that the proposed SWS-CL model achieves satisfactory recognition accuracy. The experiments also verify that the fine-tuned SWS-CL model has clear advantages on datasets with sparse labels.
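The anchor/positive/negative objective described in this abstract can be sketched with a triplet margin loss. This is a stand-in chosen for illustration: the abstract does not give the exact loss used in SWS-CL, and the Euclidean distance, the `margin` value, and the function names below are all assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (an assumed form, not the paper's loss).

    The loss falls to zero once the negative is farther from the anchor
    than the positive by at least `margin`; otherwise it penalizes the gap.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings, purely for illustration.
anchor = [0.0, 0.0]
near = [0.0, 1.0]  # distance 1 from the anchor
far = [3.0, 4.0]   # distance 5 from the anchor

# Well separated: positive is near, negative is far, so there is no penalty.
print(triplet_loss(anchor, positive=near, negative=far))  # 0.0
# Badly separated: positive is far, negative is near, so the loss is large.
print(triplet_loss(anchor, positive=far, negative=near))  # 5.0
```

Minimizing a loss of this shape pulls positive samples toward the anchor and pushes negative samples away in the latent space, which matches the behavior the abstract describes.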