To improve the accuracy of short text matching,a short text matching method with knowledge and structure enhancement for BERT(KS-BERT)was proposed in this study.This method first introduced external knowledge to the i...To improve the accuracy of short text matching,a short text matching method with knowledge and structure enhancement for BERT(KS-BERT)was proposed in this study.This method first introduced external knowledge to the input text,and then sent the expanded text to both the context encoder BERT and the structure encoder GAT to capture the contextual relationship features and structural features of the input text.Finally,the match was determined based on the fusion result of the two features.Experiment results based on the public datasets BQ_corpus and LCQMC showed that KS-BERT outperforms advanced models such as ERNIE 2.0.This Study showed that knowledge enhancement and structure enhancement are two effective ways to improve BERT in short text matching.In BQ_corpus,ACC was improved by 0.2%and 0.3%,respectively,while in LCQMC,ACC was improved by 0.4%and 0.9%,respectively.展开更多
Clothing attribute recognition has become an essential technology,which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes.However,existing me...Clothing attribute recognition has become an essential technology,which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes.However,existing methods cannot recognize newly added attributes and may fail to capture region-level visual features.To address the aforementioned issues,a region-aware fashion contrastive language-image pre-training(RaF-CLIP)model was proposed.This model aligned cropped and segmented images with category and multiple fine-grained attribute texts,achieving the matching of fashion region and corresponding texts through contrastive learning.Clothing retrieval found suitable clothing based on the user-specified clothing categories and attributes,and to further improve the accuracy of retrieval,an attribute-guided composed network(AGCN)as an additional component on RaF-CLIP was introduced,specifically designed for composed image retrieval.This task aimed to modify the reference image based on textual expressions to retrieve the expected target.By adopting a transformer-based bidirectional attention and gating mechanism,it realized the fusion and selection of image features and attribute text features.Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10(recall@k is defined as the percentage of correct samples appearing in the top k retrieval results)of 39.18 for composed image retrieval task,satisfying user needs for freely searching for clothing through images and texts.展开更多
Object positioning and 3D modeling is not only a promising research direction in robot welding, but also an emphasis research field of machine vision. For objects lacking of textures on its surface, such as welding pa...Object positioning and 3D modeling is not only a promising research direction in robot welding, but also an emphasis research field of machine vision. For objects lacking of textures on its surface, such as welding part, problem of 3D modeling of this kind of object can not be settled by traditional binocular stereo vision. In this paper, an approach to the problem is presented. A multibaseline stereo vision system with four verging configured cameras is proposed. Experiments show that we can get dense data points by the method, the results of surface reconstruction of objects with no matching texture on its surface is good. Experiment results confirmed that the method is valid in solving the problem of 3D surface reconstruction of objects lacking of matching texture.展开更多
文摘To improve the accuracy of short text matching,a short text matching method with knowledge and structure enhancement for BERT(KS-BERT)was proposed in this study.This method first introduced external knowledge to the input text,and then sent the expanded text to both the context encoder BERT and the structure encoder GAT to capture the contextual relationship features and structural features of the input text.Finally,the match was determined based on the fusion result of the two features.Experiment results based on the public datasets BQ_corpus and LCQMC showed that KS-BERT outperforms advanced models such as ERNIE 2.0.This Study showed that knowledge enhancement and structure enhancement are two effective ways to improve BERT in short text matching.In BQ_corpus,ACC was improved by 0.2%and 0.3%,respectively,while in LCQMC,ACC was improved by 0.4%and 0.9%,respectively.
基金National Natural Science Foundation of China(No.61971121)。
文摘Clothing attribute recognition has become an essential technology,which enables users to automatically identify the characteristics of clothes and search for clothing images with similar attributes.However,existing methods cannot recognize newly added attributes and may fail to capture region-level visual features.To address the aforementioned issues,a region-aware fashion contrastive language-image pre-training(RaF-CLIP)model was proposed.This model aligned cropped and segmented images with category and multiple fine-grained attribute texts,achieving the matching of fashion region and corresponding texts through contrastive learning.Clothing retrieval found suitable clothing based on the user-specified clothing categories and attributes,and to further improve the accuracy of retrieval,an attribute-guided composed network(AGCN)as an additional component on RaF-CLIP was introduced,specifically designed for composed image retrieval.This task aimed to modify the reference image based on textual expressions to retrieve the expected target.By adopting a transformer-based bidirectional attention and gating mechanism,it realized the fusion and selection of image features and attribute text features.Experimental results show that the proposed model achieves a mean precision of 0.6633 for attribute recognition tasks and a recall@10(recall@k is defined as the percentage of correct samples appearing in the top k retrieval results)of 39.18 for composed image retrieval task,satisfying user needs for freely searching for clothing through images and texts.
基金The work is partly supported by the emphasis project of national natural science foundation(No.59635160);assisted by the young foundation of 863(No.863-2.99.7).
文摘Object positioning and 3D modeling is not only a promising research direction in robot welding, but also an emphasis research field of machine vision. For objects lacking of textures on its surface, such as welding part, problem of 3D modeling of this kind of object can not be settled by traditional binocular stereo vision. In this paper, an approach to the problem is presented. A multibaseline stereo vision system with four verging configured cameras is proposed. Experiments show that we can get dense data points by the method, the results of surface reconstruction of objects with no matching texture on its surface is good. Experiment results confirmed that the method is valid in solving the problem of 3D surface reconstruction of objects lacking of matching texture.