Ulansuhai Lake is the important component part of irrigation and drainage system in Hetao irrigation region of Inner Mongolia.We applied the attribute recognition method in the summer water quality evaluation of Ulans...Ulansuhai Lake is the important component part of irrigation and drainage system in Hetao irrigation region of Inner Mongolia.We applied the attribute recognition method in the summer water quality evaluation of Ulansuhai Lake and divided according to the lake situation.The water quality in every area was analyzed,and the water quality situations in Ulansuhai Lake in 2006 and 2008 summer were gained.It provided the scientific basis for the effective utilization and the pollution treatment of Ulansuhai Lake.展开更多
Clothing attribute recognition has become an essential technology,which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes.However,existing me...Clothing attribute recognition has become an essential technology,which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes.However,existing methods cannot recognize newly added attributes and may fail to capture region-level visual features.To address the aforementioned issues,a region-aware fashion contrastive language-image pre-training(RaF-CLIP)model was proposed.This model aligned cropped and segmented images with category and multiple fine-grained attribute texts,achieving the matching of fashion region and corresponding texts through contrastive learning.Clothing retrieval found suitable clothing based on the user-specified clothing categories and attributes,and to further improve the accuracy of retrieval,an attribute-guided composed network(AGCN)as an additional component on RaF-CLIP was introduced,specifically designed for composed image retrieval.This task aimed to modify the reference image based on textual expressions to retrieve the expected target.By adopting a transformer-based bidirectional attention and gating mechanism,it realized the fusion and selection of image features and attribute text features.Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10(recall@k is defined as the percentage of correct samples appearing in the top k retrieval results)of 39.18 for composed image retrieval task,satisfying user needs for freely searching for clothing through images and texts.展开更多
Pedestrian attribute recognition in surveillance scenarios is still a challenging task due to the inaccurate localization of specific attributes. In this paper, we propose a novel view-attribute localization method ba...Pedestrian attribute recognition in surveillance scenarios is still a challenging task due to the inaccurate localization of specific attributes. In this paper, we propose a novel view-attribute localization method based on attention(VALA), which utilizes view information to guide the recognition process to focus on specific attributes and attention mechanism to localize specific attribute-corresponding areas. Concretely, view information is leveraged by the view prediction branch to generate four view weights that represent the confidences for attributes from different views. View weights are then delivered back to compose specific view-attributes, which will participate and supervise deep feature extraction. In order to explore the spatial location of a view-attribute, regional attention is introduced to aggregate spatial information and encode inter-channel dependencies of the view feature. Subsequently, a fine attentive attribute-specific region is localized, and regional weights for the view-attribute from different spatial locations are gained by the regional attention. The final view-attribute recognition outcome is obtained by combining the view weights with the regional weights. Experiments on three wide datasets(richly annotated pedestrian(RAP), annotated pedestrian v2(RAPv2), and PA-100 K) demonstrate the effectiveness of our approach compared with state-of-the-art methods.展开更多
Pedestrian attribute recognition is often considered as a multi-label image classification task. In order to make full use of attribute-related location information, a saliency guided self-attention network(SGSA-Net) ...Pedestrian attribute recognition is often considered as a multi-label image classification task. In order to make full use of attribute-related location information, a saliency guided self-attention network(SGSA-Net) was proposed to weakly supervise attribute localization, without annotations of attribute-related regions. Saliency priors were integrated into the spatial attention module(SAM). Meanwhile, channel-wise attention and spatial attention were introduced into the network. Moreover, a weighted binary cross-entropy loss(WCEL) function was employed to handle the imbalance of training data. Extensive experiments on richly annotated pedestrian(RAP) and pedestrian attribute(PETA) datasets demonstrated that SGSA-Net outperformed other state-of-the-art methods.展开更多
基金Supported by National Natural Science Fund(50969005)
文摘Ulansuhai Lake is the important component part of irrigation and drainage system in Hetao irrigation region of Inner Mongolia.We applied the attribute recognition method in the summer water quality evaluation of Ulansuhai Lake and divided according to the lake situation.The water quality in every area was analyzed,and the water quality situations in Ulansuhai Lake in 2006 and 2008 summer were gained.It provided the scientific basis for the effective utilization and the pollution treatment of Ulansuhai Lake.
基金National Natural Science Foundation of China(No.61971121)。
文摘Clothing attribute recognition has become an essential technology,which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes.However,existing methods cannot recognize newly added attributes and may fail to capture region-level visual features.To address the aforementioned issues,a region-aware fashion contrastive language-image pre-training(RaF-CLIP)model was proposed.This model aligned cropped and segmented images with category and multiple fine-grained attribute texts,achieving the matching of fashion region and corresponding texts through contrastive learning.Clothing retrieval found suitable clothing based on the user-specified clothing categories and attributes,and to further improve the accuracy of retrieval,an attribute-guided composed network(AGCN)as an additional component on RaF-CLIP was introduced,specifically designed for composed image retrieval.This task aimed to modify the reference image based on textual expressions to retrieve the expected target.By adopting a transformer-based bidirectional attention and gating mechanism,it realized the fusion and selection of image features and attribute text features.Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10(recall@k is defined as the percentage of correct samples appearing in the top k retrieval results)of 39.18 for composed image retrieval task,satisfying user needs for freely searching for clothing through images and texts.
基金supported by National Key R&D Program of China(No.2018YFB1308000)Natural Science Foundation of Zhejiang province(No.LY21F 030018)。
文摘Pedestrian attribute recognition in surveillance scenarios is still a challenging task due to the inaccurate localization of specific attributes. In this paper, we propose a novel view-attribute localization method based on attention(VALA), which utilizes view information to guide the recognition process to focus on specific attributes and attention mechanism to localize specific attribute-corresponding areas. Concretely, view information is leveraged by the view prediction branch to generate four view weights that represent the confidences for attributes from different views. View weights are then delivered back to compose specific view-attributes, which will participate and supervise deep feature extraction. In order to explore the spatial location of a view-attribute, regional attention is introduced to aggregate spatial information and encode inter-channel dependencies of the view feature. Subsequently, a fine attentive attribute-specific region is localized, and regional weights for the view-attribute from different spatial locations are gained by the regional attention. The final view-attribute recognition outcome is obtained by combining the view weights with the regional weights. Experiments on three wide datasets(richly annotated pedestrian(RAP), annotated pedestrian v2(RAPv2), and PA-100 K) demonstrate the effectiveness of our approach compared with state-of-the-art methods.
基金supported by the National Natural Science Foundation of China (41874173)。
文摘Pedestrian attribute recognition is often considered as a multi-label image classification task. In order to make full use of attribute-related location information, a saliency guided self-attention network(SGSA-Net) was proposed to weakly supervise attribute localization, without annotations of attribute-related regions. Saliency priors were integrated into the spatial attention module(SAM). Meanwhile, channel-wise attention and spatial attention were introduced into the network. Moreover, a weighted binary cross-entropy loss(WCEL) function was employed to handle the imbalance of training data. Extensive experiments on richly annotated pedestrian(RAP) and pedestrian attribute(PETA) datasets demonstrated that SGSA-Net outperformed other state-of-the-art methods.