The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Most researchers use deep metric learning to calculate the similarity between the query and the candidate image by fusing the global feature of the query image with the text feature. However, the text usually corresponds to a local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework for image retrieval with text manipulation by local feature modification (LFM-IR), which can focus on the related image regions and attributes and perform the modification. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size (CSS), Massachusetts Institute of Technology (MIT) States and Fashion200K (+8.3%, +0.7% and +4.6% in R@1).
The appearance of pedestrians can vary greatly from image to image, and different pedestrians may look similar in a given image. Such similarities and variabilities in the appearance and clothing of individuals make the task of pedestrian re-identification very challenging. Here, a pedestrian re-identification method based on the fusion of local features and gait energy image (GEI) features is proposed. In this method, the human body is divided into four regions according to joint points. The color and texture of each region are extracted as local features, and GEI features of the pedestrian gait are also obtained. The local and GEI features of the person are then fused. Distance metrics are learned independently with the cross-view quadratic discriminant analysis (XQDA) method to obtain the similarity of the image pairs, and the final similarity is obtained by weighted matching. Evaluation of the experimental results by cumulative matching characteristic (CMC) curves reveals that, after fusion of local and GEI features, pedestrian re-identification performance is improved compared with existing methods and is notably better than the recognition rate achieved with a single feature.
To fully describe the structure information of the point cloud when the LIDAR-object distance is long, a joint global and local feature (JGLF) descriptor is constructed. Compared with five typical descriptors, the object recognition rate of JGLF is higher when the LIDAR-object distance changes. When the airborne LIDAR is approaching the object, the particle filtering (PF) algorithm is used as the tracking framework. Particle weights are updated by comparing the difference between JGLFs to track the object. It is verified that the proposed algorithm performs 13.95% more accurately and stably than the basic PF algorithm.
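The weight update described in this abstract can be illustrated with a minimal sketch. The abstract does not give the likelihood model, so the Gaussian weighting, the Euclidean distance, the `sigma` parameter and the toy vectors below are all assumptions made for illustration; they are not the authors' JGLF comparison.

```python
import math

def descriptor_distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def update_weights(weights, distances, sigma=1.0):
    """Reweight particles with an (assumed) Gaussian likelihood of descriptor distance."""
    # A smaller distance to the reference descriptor gives a higher likelihood.
    likelihoods = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in distances]
    unnormalized = [w * l for w, l in zip(weights, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # Every particle is implausible; fall back to a uniform distribution.
        return [1.0 / len(weights)] * len(weights)
    return [u / total for u in unnormalized]

# Hypothetical toy data: one reference descriptor and three particle descriptors.
reference = [1.0, 0.0, 2.0]
particles = [[1.0, 0.0, 2.0], [1.5, 0.5, 2.0], [4.0, 3.0, 0.0]]
distances = [descriptor_distance(reference, p) for p in particles]
weights = update_weights([1.0 / 3.0] * 3, distances)
print(weights)
```

The particle whose descriptor matches the reference ends up with the largest normalized weight, which is the behavior a descriptor-driven tracker relies on.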
Object representation based on local features is a topical subject in the domain of image understanding and computer vision. We discuss the defects of global features in present methods and the advantages of local features in object recognition, and briefly explore state-of-the-art recognition methods using local features, especially the main approaches to local feature extraction and object representation. To explain these methods clearly, the problem of local feature extraction is divided into feature region detection, feature region description, and feature space optimization. The main components and merits of these steps are presented. Technologies for object representation are classified into three types: vector space, sliding window, and structure relationship models. Future development trends are discussed briefly.
Simultaneous Localization and Mapping (SLAM) has been widely used in emergency response, self-driving, and city-scale 3D mapping and navigation. Recent deep-learning-based feature point extractors have demonstrated superior performance under complex environmental challenges (e.g. extreme lighting) where traditional extractors struggle. In this paper, we improve the robustness and accuracy of a monocular visual SLAM system in various complex scenes by adding a deep-learning-based visual localization thread as an augmentation to the visual SLAM framework. In this thread, our feature extractor, built on an efficient lightweight deep neural network, is used for absolute pose and scale estimation in real time using a highly accurate georeferenced prior map database with 20 cm geometric accuracy, created by our in-house, low-cost integrated LiDAR and camera device. The closed-loop error of our SLAM system with and without this enhancement is 1.03 m and 18.28 m, respectively. The scale estimation of the monocular visual SLAM is also significantly improved (0.01 versus 0.98). In addition, a novel camera-LiDAR calibration workflow is provided for large-scale 3D mapping. This paper demonstrates the application and research potential of deep-learning-based visual SLAM with image and LiDAR sensors.
In this paper, we propose a product image retrieval method based on object contour corners, image texture and color. A product image mainly highlights the object, and its background is very simple. According to these characteristics, we represent the object using its contour and detect the corners of the contour to reduce the number of pixels. Every corner is described using its approximate curvature based on distance. In addition, the Block Difference of Inverse Probabilities (BDIP) and Block Variation of Local Correlation (BVLC) texture features and the color moments are extracted from the image's HIS color space. Finally, the dynamic time warping method is used to match features of different lengths. To demonstrate the effect of the proposed method, we carry out experiments on the Microsoft product image database and compare it with other feature descriptors. The retrieval precision and recall curves show that our method is feasible.
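Dynamic time warping, which the abstract uses to match features of different lengths, can be sketched as follows. This is the textbook dynamic-programming formulation on 1-D sequences with an absolute-difference local cost; the function name `dtw_distance` and the toy curvature sequences are hypothetical and are not taken from the paper.

```python
def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two 1-D sequences of any lengths."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning seq_a[:i] with seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(seq_a[i - 1] - seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical curvature sequences from contours with different corner counts.
query = [0.1, 0.5, 0.9, 0.5]
candidate = [0.1, 0.5, 0.5, 0.9, 0.9, 0.5]
print(dtw_distance(query, candidate))
```

Because the candidate only repeats values of the query, the warping path can absorb the repeats, so sequences that differ in length but not in shape still align cheaply.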
Person re-identification has emerged as a hotspot for computer vision research due to the growing demands of public safety and the rapid development of intelligent surveillance networks. Person re-identification (Re-ID) in a video surveillance system can track and identify suspicious people, and track and statistically analyze persons. The purpose of person re-identification is to recognize the same person across different cameras. Deep learning-based person re-identification research has produced numerous remarkable outcomes as a result of deep learning's growing popularity. The purpose of this paper is to help researchers better understand where person re-identification research stands at the moment and where it is headed. Firstly, this paper organizes the widely used datasets and assessment criteria in person re-identification and reviews the pertinent research on deep learning-based person re-identification techniques conducted in the last several years. Then, the commonly used techniques are discussed from four aspects: appearance features, metric learning, local features, and adversarial learning. Finally, future research directions in the field of person re-identification are outlined.
Target detection of small samples with a complex background is always difficult in the classification of remote sensing images. We propose a new small-sample target detection method combining local features and a convolutional neural network (LF-CNN), with the aim of detecting small numbers of unevenly distributed ground object targets in remote sensing images. The k-nearest neighbor method is used to construct the local neighborhood of each point, and the local neighborhoods of the features are extracted one by one from the convolution layer. All the local features are aggregated by maximum pooling to obtain a global feature representation. The classification probability of each category is then calculated, and classification is performed using the scaled exponential linear unit (SELU) function and the fully connected layer. The experimental results show that the proposed LF-CNN method achieves high accuracy in target detection and classification for hyperspectral imager remote sensing data under the condition of small samples. Despite drawbacks in both time and complexity, the proposed LF-CNN method integrates the local features of ground object samples more effectively than traditional target detection methods and improves the accuracy of target identification and detection in small samples of remote sensing images.
The matching of local descriptors represents at this moment a key tool in computer vision, with a wide variety of methods designed for tasks such as image classification, object recognition and tracking, image stitching, or data mining relying on it. Local feature description techniques are usually developed so as to provide invariance to photometric variations specific to the acquisition of natural images, but are nonetheless used in association with biomedical imaging as well. It has been previously shown that the matching of gradient-based descriptors is affected by image modifications specific to Confocal Scanning Laser Microscopy (CSLM). In this paper we extend our previous work in this direction and show how specific acquisition or post-processing methods alleviate or accentuate this problem.
The quality of an egg is mainly influenced by the dirt adhering to its shell. Even with good farm-management practices and careful handling, a small percentage of dirty eggs will be produced. The purpose of this research was to detect egg stains using image processing techniques. Compared with color values, local texture was found to be much better at accurately segmenting the complex and miscellaneous dirt stains on the egg shell. Firstly, the global threshold of the image was obtained by the two-peak method. The irrelevant background was removed using the global threshold, and the region of interest was acquired. The local texture information extracted from the region of interest was taken as the input of fuzzy C-means clustering for segmentation of the dirt stains. According to the principle of projection, the area of the dirt stains on the curved egg surface was accurately calculated. The validation results showed that the proposed method for classifying eggs in terms of stains has a specificity of 91.4% for white eggs and 89.5% for brown eggs.
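The global thresholding step can be sketched as below. The abstract only names the two-peak method, so this is one common histogram-valley variant, assumed for illustration: the distance-weighted choice of the second peak, the tie-breaking rule, the function name `two_peak_threshold` and the toy pixel values are not taken from the paper.

```python
def two_peak_threshold(pixels, levels=256):
    """Pick a global threshold in the valley between the two main histogram peaks."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # First peak: the most frequent gray level.
    first = max(range(levels), key=lambda g: hist[g])
    # Second peak: frequent and far from the first peak (distance-weighted score).
    second = max(range(levels), key=lambda g: hist[g] * (g - first) ** 2)
    lo, hi = sorted((first, second))
    # Valley: the least frequent gray levels between the peaks; take the middle one.
    valley = min(hist[lo:hi + 1])
    candidates = [g for g in range(lo, hi + 1) if hist[g] == valley]
    return candidates[len(candidates) // 2]

# Hypothetical toy image: a dark background cluster and a bright object cluster.
pixels = [20] * 50 + [30] * 30 + [120] * 2 + [200] * 40 + [210] * 25
threshold = two_peak_threshold(pixels)
foreground = [p for p in pixels if p > threshold]
print(threshold, len(foreground))
```

Pixels above the threshold would then be kept as the region of interest, on which texture features could be computed for the clustering step.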
With the advancement of computer vision techniques in surveillance systems, more proficient, intelligent, and sustainable recognition of facial expressions and age is needed. The main purpose of this study is to develop an accurate facial expression and age recognition system that is capable of error-free recognition of human expression and age in both indoor and outdoor environments. The proposed system first takes an input image, pre-processes it, and then detects faces in the entire image. After that, landmark localization helps in the formation of a synthetic face mask prediction. A novel set of features is extracted and passed to a classifier for the accurate classification of expressions and age group. The proposed system is tested on two benchmark datasets, namely, the Gallagher collection person dataset and the Images of Groups dataset. The system achieved remarkable results on these benchmark datasets in terms of recognition accuracy and computational time. The proposed system would also be applicable in different consumer application domains such as online business negotiations, consumer behavior analysis, E-learning environments, and emotion robotics.
Traditional hand-crafted features for representing local image patches are evolving into data-driven, learning-based image features, but learning a robust and discriminative descriptor that is capable of handling various patch-level computer vision tasks is still an open problem. In this work, we propose a novel deep convolutional neural network (CNN) to learn local feature descriptors. We utilize quadruplets with positive and negative training samples, together with a constraint that restricts the intra-class variance, to learn discriminative CNN representations. Compared with previous works, our model reduces the overlap in feature space between corresponding and non-corresponding patch pairs, and mitigates the margin varying problem caused by the commonly used triplet loss. We demonstrate that our method achieves better embedding results than some recent works, such as PN-Net and TN-TG, on a benchmark dataset.
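The difference between a triplet and a quadruplet objective can be illustrated with a minimal sketch. The abstract does not state the authors' loss or their intra-class variance constraint, so the code below shows only the generic margin-based quadruplet form; the margins, the Euclidean distance and the toy embeddings are assumptions.

```python
import math

def dist(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss pushing the negative farther from the anchor than the positive."""
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)

def quadruplet_loss(anchor, positive, neg1, neg2, margin1=1.0, margin2=0.5):
    """Generic quadruplet loss: a triplet term plus a term on an unrelated negative pair."""
    first = max(0.0, dist(anchor, positive) - dist(anchor, neg1) + margin1)
    # The second term compares the matching pair with a non-matching pair
    # that does not involve the anchor at all.
    second = max(0.0, dist(anchor, positive) - dist(neg1, neg2) + margin2)
    return first + second

# Hypothetical 2-D embeddings; neg1 and neg2 come from two different classes.
anchor, positive = [0.0, 0.0], [0.3, 0.4]
neg1, neg2 = [3.0, 4.0], [3.0, 4.2]
t_loss = triplet_loss(anchor, positive, neg1)
q_loss = quadruplet_loss(anchor, positive, neg1, neg2)
print(t_loss, q_loss)
```

In this toy case the triplet term is already satisfied, while the extra quadruplet term still penalizes the two non-matching patches for lying closer together than the matching pair.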
In the past ten years, research on face recognition has shifted to using 3D facial surfaces, as 3D geometric information provides more discriminative features. This comprehensive survey reviews 3D face recognition techniques developed in the past decade, covering both conventional methods and deep learning methods. These methods are evaluated with detailed descriptions of selected representative works. Their advantages and disadvantages are summarized in terms of accuracy, complexity, and robustness to facial variations (expression, pose, occlusion, etc.). A review of 3D face databases is also provided, along with a discussion of future research challenges and directions.
In pedestrian re-recognition, traditional methods are affected by changes in background, veil, clothing and so on, which degrade recognition performance. To reduce the impact of these changes, this paper proposes a pedestrian re-recognition method based on the cycle-consistent generative adversarial network and multi-feature fusion. Pedestrian re-recognition is accomplished by comparing the measured distance between two pedestrians. Firstly, this paper uses CycleGAN to transform and expand the data set, so as to reduce the influence of pedestrian posture changes as much as possible. The method consists of two branches: global feature extraction and local feature extraction. The global feature and local feature are then fused. The fused features are used for comparison metric learning, and the similarity scores are calculated to rank the samples. A large number of experimental results on the large data sets CUHK03 and VIPER show that this new method reduces the influence of background, veil, clothing and other changes on recognition performance.
Fine-grained few-shot learning is a difficult task in image classification. The reason is that the discriminative features of fine-grained images are often located in local areas of the image, while most existing few-shot image classification methods use only top-level features and adopt a single measure. As a result, the local features of the sample cannot be learned well. In response to this problem, an ensemble relation network with multi-level measure (ERN-MM) is proposed in this paper. It adds relation modules in the shallow feature space to compare the similarity between samples in terms of local features, and finally integrates the similarity scores from the feature spaces to assign labels to the query samples. The proposed ERN-MM can therefore use both local details and global information at different granularities. Experimental results on different fine-grained datasets show that the proposed method achieves good classification performance and confirm its rationality.
Funding (LFM-IR image retrieval with text manipulation): Shanghai Sailing Program, China (No. 21YF1401300); Shanghai Science and Technology Innovation Action Plan, China (No. 19511101802); Fundamental Research Funds for the Central Universities, China (No. 2232021D-25).
Funding (pedestrian re-identification with local and GEI features): This research was funded by the Science and Technology Support Plan Project of Hebei Province (grant numbers 17210803D and 19273703D), the Science and Technology Spark Project of the Hebei Seismological Bureau (grant number DZ20180402056), the Education Department of Hebei Province (grant number QN2018095), and the Polytechnic College of Hebei University of Science and Technology.
Funding (JGLF descriptor for LIDAR object tracking): This work was supported by the National Natural Science Foundation of China (Nos. 61271353 and 61871389), the Foundation of the State Key Laboratory of Pulsed Power Laser Technology (No. SKL2018ZR09), and the Major Funding Projects of National University of Defense Technology (No. ZK18-01-02).
Funding (object representation based on local features): Supported by the National Basic Research Program (973) of China (No. 2012CB821206), the National Natural Science Foundation of China (No. 71201004), the Scientific Research Common Program of Beijing Municipal Commission of Education (No. KM201310011009), and the Funding Project for Innovation on Science, Technology and Graduate Education in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (Nos. PXM2012_014213_000037 and PXM2012_014213_000079).
Funding (deep-learning-based visual SLAM): Supported by the National Key Research and Development Program of China (grant number 2019YFC1511304) and by the Pilot Fund of Frontier Science and Disruptive Technology of the Aerospace Information Research Institute, Chinese Academy of Sciences (grant number E0Z21101).
Funding (product image retrieval): Supported by the Major Program of the National Natural Science Foundation of China (No. 70890080 and No. 70890083).
Funding (LF-CNN small-sample target detection): This work was partially supported by the Key Laboratory for Digital Land and Resources of Jiangxi Province, East China University of Technology (DLLJ202103), the Science and Technology Commission of Shanghai Municipality (No. 19142201600), and the Graduate Innovation and Entrepreneurship Program of Shanghai University, China (No. 2019GY04).
Funding (local descriptor matching in CSLM): UEFISCDI PN-II-PT-PCCA-2011-3.2-1162 Research Grant; CRUS SCIEX NMS-CH Fellowship nr. 12.135.
Funding (egg stain detection): The authors gratefully acknowledge the financial support of the National Science & Technology Pillar Program (2015BAD19B05).
Funding (facial expression and age recognition): This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2018R1D1A1A02085645). This work was also supported by the Korea Medical Device Development Fund grant funded by the Korean government (the Ministry of Science and ICT, the Ministry of Trade, Industry and Energy, the Ministry of Health & Welfare, the Ministry of Food and Drug Safety) (Project Number: 202012D05-02).
Funding: Supported by the Natural Science Foundation of Zhejiang Province (No. Y16F020023).
Abstract: Traditional hand-crafted features for representing local image patches are giving way to data-driven, learning-based image features, but learning a robust and discriminative descriptor that can handle various patch-level computer vision tasks is still an open problem. In this work, we propose a novel deep convolutional neural network (CNN) to learn local feature descriptors. We use quadruplets of positive and negative training samples, together with a constraint that restricts the intra-class variance, to learn discriminative CNN representations. Compared with previous works, our model reduces the overlap in feature space between corresponding and non-corresponding patch pairs and mitigates the margin-varying problem caused by the commonly used triplet loss. We demonstrate that our method achieves better embedding results than some recent works, such as PN-Net and TN-TG, on a benchmark dataset.
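To make the idea of training on quadruplets more concrete, the following is a minimal sketch of a generic quadruplet hinge loss on descriptor vectors. The abstract above does not state the exact loss, so this is an assumed, commonly used form and not the authors' formulation; the intra-class variance constraint they mention is not reproduced here. The function names, the use of plain Euclidean distance, and the margin values `margin1` and `margin2` are all illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def quadruplet_loss(anchor, positive, negative1, negative2,
                    margin1=1.0, margin2=0.5):
    """Generic quadruplet hinge loss (assumed form, not the paper's).

    anchor and positive describe corresponding patches; negative1 and
    negative2 describe two further patches that correspond neither to
    the anchor nor to each other.
    """
    d_ap = euclidean(anchor, positive)      # matching pair distance
    d_an = euclidean(anchor, negative1)     # non-matching, shares the anchor
    d_nn = euclidean(negative1, negative2)  # non-matching, anchor-free

    # Triplet-style term: the matching pair should be closer than the
    # anchor-negative pair by at least margin1.
    relative_term = max(0.0, d_ap - d_an + margin1)

    # Extra term: the matching pair should also be closer than a
    # non-matching pair that does not involve the anchor.
    absolute_term = max(0.0, d_ap - d_nn + margin2)

    return relative_term + absolute_term
```

The second term is what separates this from a plain triplet loss: it compares the matching distance with a non-matching distance that has no sample in common with it, which pushes corresponding and non-corresponding pairs apart across the whole feature space and not only relative to one anchor.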
Abstract: In the past ten years, research on face recognition has shifted to using 3D facial surfaces, as 3D geometric information provides more discriminative features. This comprehensive survey reviews 3D face recognition techniques developed in the past decade, covering both conventional methods and deep learning methods. These methods are evaluated with detailed descriptions of selected representative works. Their advantages and disadvantages are summarized in terms of accuracy, complexity, and robustness to facial variations (expression, pose, occlusion, etc.). A review of 3D face databases is also provided, along with a discussion of future research challenges and directions.
Abstract: Traditional pedestrian re-identification methods are affected by changes in background, occlusion, clothing, and similar factors, which degrade recognition performance. To reduce the impact of these changes, this paper proposes a pedestrian re-identification method based on the cycle-consistent generative adversarial network (CycleGAN) and multi-feature fusion. Pedestrian re-identification is accomplished by comparing the measured distance between two pedestrians. First, CycleGAN is used to transform and expand the dataset, so as to reduce the influence of pedestrian posture changes as much as possible. The method consists of two branches: global feature extraction and local feature extraction. The global and local features are then fused. The fused features are used for comparison metric learning, and similarity scores are calculated to rank the samples. Extensive experiments on the CUHK03 and VIPeR datasets show that the new method reduces the influence of background, occlusion, clothing, and other changes on recognition performance.
Funding: Supported by the National Natural Science Foundation of China (62176110, 62111530146, 61906080) and the Young Doctoral Fund of the Education Department of Gansu Province (2021QB-038).
Abstract: Fine-grained few-shot learning is a difficult task in image classification. The reason is that the discriminative features of fine-grained images are often located in local areas of the image, while most existing few-shot image classification methods use only top-level features and adopt a single measure, so the local features of the samples cannot be learned well. In response to this problem, an ensemble relation network with multi-level measure (ERN-MM) is proposed in this paper. It adds relation modules in the shallow feature space to compare the similarity between samples in terms of local features, and finally integrates the similarity scores from the feature spaces to assign labels to the query samples. The proposed ERN-MM can therefore use both local details and global information at different granularities. Experimental results on different fine-grained datasets show that the proposed method achieves good classification performance, which supports its rationality.
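The idea of integrating similarity scores from several feature levels, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: the learned relation modules of ERN-MM are replaced here by plain cosine similarity, the fusion is a weighted average, and the names `classify_multi_level`, `query_feats`, `class_feats`, and `weights` are hypothetical. The abstract does not specify how the scores are actually combined.

```python
import math


def cosine(u, v):
    """Cosine similarity; a stand-in for a learned relation module."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify_multi_level(query_feats, class_feats, weights=None):
    """Fuse per-level similarity scores and pick the best class.

    query_feats: list of feature vectors, one per level (shallow to top).
    class_feats: dict mapping label -> list of class feature vectors,
                 one per level, in the same order as query_feats.
    weights: optional per-level weights; defaults to a uniform average
             (an assumption, not taken from the paper).
    """
    n_levels = len(query_feats)
    if weights is None:
        weights = [1.0 / n_levels] * n_levels

    scores = {}
    for label, protos in class_feats.items():
        # One similarity score per level, then a weighted sum.
        scores[label] = sum(
            w * cosine(q, p) for w, q, p in zip(weights, query_feats, protos)
        )

    best = max(scores, key=scores.get)
    return best, scores
```

Setting all weight on the last level reduces this to a single top-level measure, which makes it easy to compare a single-measure decision with the fused one on the same features.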