Clothing attribute recognition has become an essential technology that enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes. However, existing methods cannot recognize newly added attributes and may fail to capture region-level visual features. To address these issues, a region-aware fashion contrastive language-image pre-training (RaF-CLIP) model is proposed. The model aligns cropped and segmented images with category texts and multiple fine-grained attribute texts, matching fashion regions to their corresponding texts through contrastive learning. Clothing retrieval finds suitable clothing based on user-specified clothing categories and attributes. To further improve retrieval accuracy, an attribute-guided composed network (AGCN) is introduced as an additional component on RaF-CLIP, designed specifically for composed image retrieval, a task that modifies the reference image according to textual expressions in order to retrieve the expected target. Using transformer-based bidirectional attention and a gating mechanism, the network fuses and selects image features and attribute text features. Experimental results show that the proposed model achieves a mean precision of 0.6633 on attribute recognition and a recall@10 of 39.18 on composed image retrieval (recall@k is defined as the percentage of correct samples appearing in the top k retrieval results), satisfying user needs for freely searching for clothing through images and texts.
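The recall@k definition quoted in the abstract can be illustrated with a minimal sketch. It assumes that each query has exactly one correct target and that the retrieval system returns a ranked list of candidate identifiers per query. The function name `recall_at_k` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Hashable, Sequence


def recall_at_k(
    ranked_ids: Sequence[Sequence[Hashable]],
    target_ids: Sequence[Hashable],
    k: int,
) -> float:
    """Percentage of queries whose target appears in the top-k results.

    Assumption (not from the paper): one correct target per query.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if len(ranked_ids) != len(target_ids):
        raise ValueError("one ranked list is required per target")
    if not target_ids:
        return 0.0
    # A query counts as a hit when its target is among the first k results.
    hits = sum(
        1
        for ranked, target in zip(ranked_ids, target_ids)
        if target in ranked[:k]
    )
    return 100.0 * hits / len(target_ids)


# Toy data: four queries, each with a ranked list of three candidates.
rankings = [
    ["a", "b", "c"],
    ["d", "e", "f"],
    ["g", "h", "i"],
    ["j", "k", "l"],
]
targets = ["b", "f", "x", "j"]

print(recall_at_k(rankings, targets, k=2))  # 50.0
print(recall_at_k(rankings, targets, k=3))  # 75.0
```

Under this reading, the reported recall@10 of 39.18 would mean that the expected target appears among the first ten results for about 39% of queries.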
This paper investigates multimodality in college English textbooks. Because multimodality is closely associated with college English teaching and learning, it is necessary to examine how it operates in these textbooks. First, the study shows that the images in college English textbooks serve the following functions: to illustrate the text, to consolidate the information in the text, to aestheticize the text, and to collaborate with the text in establishing harmony between the author and the reader, thereby conveying the author's intentions well. Second, the study shows that multimodality in college English textbooks reflects both a power relation and a solidarity relation between the author and the reader. On the one hand, it implies an unequal power relationship: the author selects the images, while the reader has little influence over them. On the other hand, the learning activities give the reader many opportunities to engage with the images, which establishes a familiar and harmonious relationship between the reader and the author. The findings have useful implications for college English teaching and learning and point to the need for further studies in this field.
This paper examines the image and text relationship in TANG Yin's scroll of poetry and painting from three aspects. The first focuses on the schema type of the image and text relationship in physical form. The second explores the text's (poetry's) functions of anchorage and relay in the appreciation of the images (paintings). The third traces the semiosis process of the image, exploring how image and text, as cultural products in the epistemological world, mediate with the phenomenological world.
Funding: National Natural Science Foundation of China (No. 61971121).