The task of food image recognition,a nuanced subset of fine-grained image recognition,grapples with substantial intra-class variation and minimal inter-class differences.These challenges are compounded by the irregula...The task of food image recognition,a nuanced subset of fine-grained image recognition,grapples with substantial intra-class variation and minimal inter-class differences.These challenges are compounded by the irregular and multi-scale nature of food images.Addressing these complexities,our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion,grounded in the ConvNeXt architecture.Our model employs hybrid attention(HA)mechanisms to pinpoint critical discriminative regions within images,substantially mitigating the influence of background noise.Furthermore,it introduces a multi-stage local fusion(MSLF)module,fostering long-distance dependencies between feature maps at varying stages.This approach facilitates the assimilation of complementary features across scales,significantly bolstering the model’s capacity for feature extraction.Furthermore,we constructed a dataset named Roushi60,which consists of 60 different categories of common meat dishes.Empirical evaluation of the ETH Food-101,ChineseFoodNet,and Roushi60 datasets reveals that our model achieves recognition accuracies of 91.12%,82.86%,and 92.50%,respectively.These figures not only mark an improvement of 1.04%,3.42%,and 1.36%over the foundational ConvNeXt network but also surpass the performance of most contemporary food image recognition methods.Such advancements underscore the efficacy of our proposed model in navigating the intricate landscape of food image recognition,setting a new benchmark for the field.展开更多
The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Deep metric learning methods are used by most researchers to calculate the similarity between the qu...The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Deep metric learning methods are used by most researchers to calculate the similarity between the query and the candidate image by fusing the global feature of the query image and the text feature. However, the text usually corresponds to the local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework of image retrieval with text manipulation by local feature modification(LFM-IR) which can focus on the related image regions and attributes and perform modification. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size(CSS), Massachusetts Institute of Technology(MIT) States and Fashion200K(+8.3%, +0.7% and +4.6% in R@1).展开更多
Obtaining a 3D feature description with high descriptiveness and robustness under complicated nuisances is a significant and challenging task in 3D feature matching.This paper proposes a novel feature description cons...Obtaining a 3D feature description with high descriptiveness and robustness under complicated nuisances is a significant and challenging task in 3D feature matching.This paper proposes a novel feature description consisting of a stable local reference frame(LRF)and a feature descriptor based on local spatial voxels.First,an improved LRF was designed by incorporating distance weights into Z-and X-axis calculations.Subsequently,based on the LRF and voxel segmentation,a feature descriptor based on voxel homogenization was proposed.Moreover,uniform segmentation of cube voxels was performed,considering the eigenvalues of each voxel and its neighboring voxels,thereby enhancing the stability of the description.The performance of the descriptor was strictly tested and evaluated on three public datasets,which exhibited high descriptiveness,robustness,and superior performance compared with other current methods.Furthermore,the descriptor was applied to a 3D registration trial,and the results demonstrated the reliability of our approach.展开更多
With the aim of extracting the features of face images in face recognition, a new method of face recognition by fusing global features and local features is presented. The global features are extracted using principal...With the aim of extracting the features of face images in face recognition, a new method of face recognition by fusing global features and local features is presented. The global features are extracted using principal component analysis (PCA). Active appearance model (AAM) locates 58 facial fiducial points, from which 17 points are characterized as local features using the Gabor wavelet transform (GWT). Normalized global match degree (local match degree) can be obtained by global features (local features) of the probe image and each gallery image. After the fusion of normalized global match degree and normalized local match degree, the recognition result is the class that included the gallery image corresponding to the largest fused match degree. The method is evaluated by the recognition rates over two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and robust to expression, illumination and pose variation in some degree.展开更多
Local invariant algorithm applied in downward-looking image registration,usually computes the camera's pose relative to visual landmarks.Generally,there are three requirements in the process of image registration whe...Local invariant algorithm applied in downward-looking image registration,usually computes the camera's pose relative to visual landmarks.Generally,there are three requirements in the process of image registration when using these approaches.First,the algorithm is apt to be influenced by illumination.Second,algorithm should have less computational complexity.Third,the depth information of images needs to be estimated without other sensors.This paper investigates a famous local invariant feature named speeded up robust feature(SURF),and proposes a highspeed and robust image registration and localization algorithm based on it.With supports from feature tracking and pose estimation methods,the proposed algorithm can compute camera poses under different conditions of scale,viewpoint and rotation so as to precisely localize object's position.At last,the study makes registration experiment by scale invariant feature transform(SIFT),SURF and the proposed algorithm,and designs a method to evaluate their performances.Furthermore,this study makes object retrieval test on remote sensing video.For there is big deformation on remote sensing frames,the registration algorithm absorbs the Kanade-Lucas-Tomasi(KLT) 3-D coplanar calibration feature tracker methods,which can localize interesting targets precisely and efficiently.The experimental results prove that the proposed method has a higher localization speed and lower localization error rate than traditional visual simultaneous localization and mapping(vSLAM) in a period of time.展开更多
The appearance of pedestrians can vary greatly from image to image,and different pedestrians may look similar in a given image.Such similarities and variabilities in the appearance and clothing of individuals make the...The appearance of pedestrians can vary greatly from image to image,and different pedestrians may look similar in a given image.Such similarities and variabilities in the appearance and clothing of individuals make the task of pedestrian re-identification very challenging.Here,a pedestrian re-identification method based on the fusion of local features and gait energy image(GEI)features is proposed.In this method,the human body is divided into four regions according to joint points.The color and texture of each region of the human body are extracted as local features,and GEI features of the pedestrian gait are also obtained.These features are then fused with the local and GEI features of the person.Independent distance measure learning using the cross-view quadratic discriminant analysis(XQDA)method is used to obtain the similarity of the metric function of the image pairs,and the final similarity is acquired by weight matching.Evaluation of experimental results by cumulative matching characteristic(CMC)curves reveals that,after fusion of local and GEI features,the pedestrian re-identification effect is improved compared with existing methods and is notably better than the recognition rate of pedestrian re-identification with a single feature.展开更多
This paper proposes a novel robust image watermarking scheme for digital images using local invariant features and Independent Component Analysis (ICA). Most present watermarking algorithms are unable to resist geom...This paper proposes a novel robust image watermarking scheme for digital images using local invariant features and Independent Component Analysis (ICA). Most present watermarking algorithms are unable to resist geometric distortions that desynchronize the location. The method we propose here is robust to geometric attacks. In order to resist geometric distortions, we use a local invariant feature of the image called the scale invariant feature transform, which is invariant to translation and scaling distortions. The watermark is inserted into the circular patches generated by scale-invariant key point extractor. Rotation invariance is achieved using the translation property of the polar-mapped circular patches. Our method belongs to the blind watermark category, because we use Independent Component Analysis for detection that does not need the original image during detection. Experimental results show that our method is robust against geometric distortion attacks as well as signal-processing attacks.展开更多
Vehicle detectition in still images is a comparatively difficult task. This paper presents a method for this task by using boosted local pattern detector constructed from two local features including Haar-like and ori...Vehicle detectition in still images is a comparatively difficult task. This paper presents a method for this task by using boosted local pattern detector constructed from two local features including Haar-like and oriented gradient features. The whole process is composed of three stages. In the first stage, local appearance features of vehicles and non-vehicle objects are extracted. Haar-tike and oriented gradient features are extracted separately in this stage as local features. In the second stage, Adabeost algorithm is used to select the most discriminative features as weak detectors from the two local feature sets, and a strong local pattern detector is built by the weighted combination of these selected weak detectors. Finally, vehicle detection can be performed in still images by using the boosted strong local feature detector. Experiment results show that the local pattern detector constructed in this way combines the advantages of Haar-like and oriented gradient features, and can achieve better detection results than the detector by using single Haar-like features.展开更多
The fingerspelling recognition by hand shape is an important step for developing a human-computer interaction system. A method of fingerspelling recognition by hand shape using HLAC (higher-order local auto-correlat...The fingerspelling recognition by hand shape is an important step for developing a human-computer interaction system. A method of fingerspelling recognition by hand shape using HLAC (higher-order local auto-correlation) features is proposed. Furthermore, in order to use HLAC features more effectively, the use of image processing techniques: reducing an image resolution, dividing an image, and image pre-processing techniques, is also proposed. The experimental results show that the proposed method is promising.展开更多
In this paper, we propose a product image retrieval method based on the object contour corners, image texture and color. The product image mainly highlights the object and its background is very simple. According to t...In this paper, we propose a product image retrieval method based on the object contour corners, image texture and color. The product image mainly highlights the object and its background is very simple. According to these characteristics, we represent the object using its contour, and detect the corners of contour to reduce the number of pixels. Every corner is described using its approximate curvature based on distance. In addition, the Block Difference of Inverse Probabilities (BDIP) and Block Variation of Local Correlation (BVLC) texture features and color moment are extracted from image's HIS color space. Finally, dynamic time warping method is used to match features with different length. In order to demonstrate the effect of the proposed method, we carry out experiments in Mi-crosoft product image database, and compare it with other feature descriptors. The retrieval precision and recall curves show that our method is feasible.展开更多
In this paper, we present a tire defect detection algorithm based on sparse representation. The dictionary learned from reference images can efficiently represent the test image. As the representation coefficients of ...In this paper, we present a tire defect detection algorithm based on sparse representation. The dictionary learned from reference images can efficiently represent the test image. As the representation coefficients of normal images have a specific distribution, the local feature can be estimate by comparing representation coefficient distribution. Meanwhile, a coding length is used to measure the global features of representation coefficients. The tire defect is located by both these local and global features. Experimental results demonstrate that the proposed method can accurately detect and locate the tire defects.展开更多
In pedestrian re-recognition,the traditional pedestrian re-recognition method will be affected by the changes of background,veil,clothing and so on,which will make the recognition effect decline.In order to reduce the...In pedestrian re-recognition,the traditional pedestrian re-recognition method will be affected by the changes of background,veil,clothing and so on,which will make the recognition effect decline.In order to reduce the impact of background,veil,clothing and other changes on the recognition effect,this paper proposes a pedestrian re-recognition method based on the cycle-consistent generative adversarial network and multifeature fusion.By comparing the measured distance between two pedestrians,pedestrian re-recognition is accomplished.Firstly,this paper uses Cycle GAN to transform and expand the data set,so as to reduce the influence of pedestrian posture changes as much as possible.The method consists of two branches:global feature extraction and local feature extraction.Then the global feature and local feature are fused.The fused features are used for comparison measurement learning,and the similarity scores are calculated to sort the samples.A large number of experimental results on large data sets CUHK03 and VIPER show that this new method reduces the influence of background,veil,clothing and other changes on the recognition effect.展开更多
Person re-identification has emerged as a hotspot for computer vision research due to the growing demands of social public safety requirements and the quick development of intelligent surveillance networks.Person re-i...Person re-identification has emerged as a hotspot for computer vision research due to the growing demands of social public safety requirements and the quick development of intelligent surveillance networks.Person re-identification(Re-ID)in video surveillance system can track and identify suspicious people,track and statistically analyze persons.The purpose of person re-identification is to recognize the same person in different cameras.Deep learning-based person re-identification research has produced numerous remarkable outcomes as a result of deep learning's growing popularity.The purpose of this paperis to help researchers better understand where person re-identification research is at the moment and where it is headed.Firstly,this paper arranges the widely used datasets and assessment criteria in person re-identification and reviews the pertinent research on deep learning-based person re-identification techniques conducted in the last several years.Then,the commonly used method techniques are also discussed from four aspects:appearance features,metric learning,local features,and adversarial learning.Finally,future research directions in the field of person re-identification are outlooked.展开更多
Because of the ambiguity and dynamic nature of natural language,the research of named entity recognition is very challenging.As an international language,English plays an important role in the fields of science and te...Because of the ambiguity and dynamic nature of natural language,the research of named entity recognition is very challenging.As an international language,English plays an important role in the fields of science and technology,finance and business.Therefore,the early named entity recognition technology is mainly based on English,which is often used to identify the names of people,places and organizations in the text.International conferences in the field of natural language processing,such as CoNLL,MUC,and ACE,have identified named entity recognition as a specific evaluation task,and the relevant research uses evaluation corpus from English-language media organizations such as the Wall Street Journal,the New York Times,and Wikipedia.The research of named entity recognition on relevant data has achieved good results.Aiming at the sparse distribution of entities in text,a model combining local and global features is proposed.The model takes a single English character as input,and uses the local feature layer composed of local attention and convolution to process the text pieceby way of sliding window to construct the corresponding local features.In addition,the self-attention mechanism is used to generate the global features of the text to improve the recognition effect of the model on long sentences.Experiments on three data sets,Resume,MSRA and Weibo,show that the proposed method can effectively improve the model’s recognition of English named entities.展开更多
A new method for solving the tiling problem of surface reconstruction is proposed. The proposed method uses a snake algorithm to segment the original images, the contours are then transformed into strings by Freeman'...A new method for solving the tiling problem of surface reconstruction is proposed. The proposed method uses a snake algorithm to segment the original images, the contours are then transformed into strings by Freeman' s code. Symbolic string matching technique is applied to establish a correspondence between the two consecutive contours. The surface is composed of the pieces reconstructed from the correspondence points. Experimental results show that the proposed method exhibits a good behavior for the quality of surface reconstruction and its time complexity is proportional to mn where m and n are the numbers of vertices of the two consecutive slices, respectively.展开更多
A survey of the population densities of rice planthoppers is important for forecasting decisions and efficient control. Tra- ditional manual surveying of rice planthoppers is time-consuming, fatiguing, and subjective....A survey of the population densities of rice planthoppers is important for forecasting decisions and efficient control. Tra- ditional manual surveying of rice planthoppers is time-consuming, fatiguing, and subjective. A new three-layer detection method was proposed to detect and identify white-backed planthoppers (WBPHs, Sogatella furcifera (Horvath)) and their developmental stages using image processing. In the first two detection layers, we used an AdaBoost classifier that was trained on a histogram of oriented gradient (HOG) features and a support vector machine (SVM) classifier that was trained on Gabor and Local Binary Pattern (LBP) features to detect WBPHs and remove impurities. We achieved a detection rate of 85.6% and a false detection rate of 10.2%. In the third detection layer, a SVM classifier that was trained on the HOG features was used to identify the different developmental stages of the WBPHs, and we achieved an identification rate of 73.1%, a false identification rate of 23.3%, and a 5.6% false detection rate for the images without WBPHs. The proposed three-layer detection method is feasible and effective for the identification of different developmental stages of planthoppers on rice plants in paddy fields.展开更多
With the advancement of computer vision techniques in surveillance systems,the need for more proficient,intelligent,and sustainable facial expressions and age recognition is necessary.The main purpose of this study is...With the advancement of computer vision techniques in surveillance systems,the need for more proficient,intelligent,and sustainable facial expressions and age recognition is necessary.The main purpose of this study is to develop accurate facial expressions and an age recognition system that is capable of error-free recognition of human expression and age in both indoor and outdoor environments.The proposed system first takes an input image pre-process it and then detects faces in the entire image.After that landmarks localization helps in the formation of synthetic face mask prediction.A novel set of features are extracted and passed to a classifier for the accurate classification of expressions and age group.The proposed system is tested over two benchmark datasets,namely,the Gallagher collection person dataset and the Images of Groups dataset.The system achieved remarkable results over these benchmark datasets about recognition accuracy and computational time.The proposed system would also be applicable in different consumer application domains such as online business negotiations,consumer behavior analysis,E-learning environments,and emotion robotics.展开更多
How to construct an appropriate spatial consistent measurement is the key to improving image retrieval performance. To address this problem, this paper introduces a novel image retrieval mechanism based on the family ...How to construct an appropriate spatial consistent measurement is the key to improving image retrieval performance. To address this problem, this paper introduces a novel image retrieval mechanism based on the family filtration in object region. First, we supply an object region by selecting a rectangle in a query image such that system returns a ranked list of images that contain the same object, retrieved from the corpus based on 100 images, as a result of the first rank. To further improve retrieval performance, we add an efficient spatial consistency stage, which is named family-based spatial consistency filtration, to re-rank the results returned by the first rank. We elaborate the performance of the retrieval system by some experiments on the dataset selected from the key frames of "TREC Video Retrieval Evaluation 2005 (TRECVID2005)". The results of experiments show that the retrieval mechanism proposed by us has vast major effect on the retrieval quality. The paper also verifies the stability of the retrieval mechanism by increasing the number of images from 100 to 2000 and realizes generalized retrieval with the object outside the dataset.展开更多
Target detection of small samples with a complex background is always difficult in the classification of remote sensing images.We propose a new small sample target detection method combining local features and a convo...Target detection of small samples with a complex background is always difficult in the classification of remote sensing images.We propose a new small sample target detection method combining local features and a convolutional neural network(LF-CNN)with the aim of detecting small numbers of unevenly distributed ground object targets in remote sensing images.The k-nearest neighbor method is used to construct the local neighborhood of each point and the local neighborhoods of the features are extracted one by one from the convolution layer.All the local features are aggregated by maximum pooling to obtain global feature representation.The classification probability of each category is then calculated and classified using the scaled expected linear units function and the full connection layer.The experimental results show that the proposed LF-CNN method has a high accuracy of target detection and classification for hyperspectral imager remote sensing data under the condition of small samples.Despite drawbacks in both time and complexity,the proposed LF-CNN method can more effectively integrate the local features of ground object samples and improve the accuracy of target identification and detection in small samples of remote sensing images than traditional target detection methods.展开更多
A two-stage object recognition algorithm with the presence of occlusion is presented for microassembly. Coarse localization determines whether template is in image or not and approximately where it is, and fine locali...A two-stage object recognition algorithm with the presence of occlusion is presented for microassembly. Coarse localization determines whether template is in image or not and approximately where it is, and fine localization gives its accurate position. In coarse localization, local feature, which is invariant to translation, rotation and occlusion, is used to form signatures. By comparing signature of template with that of image, approximate transformation parameter from template to image is obtained, which is used as initial parameter value for fine localization. An objective function, which is a function of transformation parameter, is constructed in fine localization and minimized to realize sub-pixel localization accuracy. The occluded pixels are not taken into account in objective function, so the localization accuracy will not be influenced by the occlusion.展开更多
基金The support of this research was by Hubei Provincial Natural Science Foundation(2022CFB449)Science Research Foundation of Education Department of Hubei Province(B2020061),are gratefully acknowledged.
文摘The task of food image recognition,a nuanced subset of fine-grained image recognition,grapples with substantial intra-class variation and minimal inter-class differences.These challenges are compounded by the irregular and multi-scale nature of food images.Addressing these complexities,our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion,grounded in the ConvNeXt architecture.Our model employs hybrid attention(HA)mechanisms to pinpoint critical discriminative regions within images,substantially mitigating the influence of background noise.Furthermore,it introduces a multi-stage local fusion(MSLF)module,fostering long-distance dependencies between feature maps at varying stages.This approach facilitates the assimilation of complementary features across scales,significantly bolstering the model’s capacity for feature extraction.Furthermore,we constructed a dataset named Roushi60,which consists of 60 different categories of common meat dishes.Empirical evaluation of the ETH Food-101,ChineseFoodNet,and Roushi60 datasets reveals that our model achieves recognition accuracies of 91.12%,82.86%,and 92.50%,respectively.These figures not only mark an improvement of 1.04%,3.42%,and 1.36%over the foundational ConvNeXt network but also surpass the performance of most contemporary food image recognition methods.Such advancements underscore the efficacy of our proposed model in navigating the intricate landscape of food image recognition,setting a new benchmark for the field.
基金Foundation items:Shanghai Sailing Program,China (No. 21YF1401300)Shanghai Science and Technology Innovation Action Plan,China (No.19511101802)Fundamental Research Funds for the Central Universities,China (No.2232021D-25)。
文摘The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Deep metric learning methods are used by most researchers to calculate the similarity between the query and the candidate image by fusing the global feature of the query image and the text feature. However, the text usually corresponds to the local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework of image retrieval with text manipulation by local feature modification(LFM-IR) which can focus on the related image regions and attributes and perform modification. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size(CSS), Massachusetts Institute of Technology(MIT) States and Fashion200K(+8.3%, +0.7% and +4.6% in R@1).
基金the National Natural Science Foundation of China,No.51705469the Zhengzhou University Youth Talent Enterprise Cooperative Innovation Team Support Program Project(2021,2022).
文摘Obtaining a 3D feature description with high descriptiveness and robustness under complicated nuisances is a significant and challenging task in 3D feature matching.This paper proposes a novel feature description consisting of a stable local reference frame(LRF)and a feature descriptor based on local spatial voxels.First,an improved LRF was designed by incorporating distance weights into Z-and X-axis calculations.Subsequently,based on the LRF and voxel segmentation,a feature descriptor based on voxel homogenization was proposed.Moreover,uniform segmentation of cube voxels was performed,considering the eigenvalues of each voxel and its neighboring voxels,thereby enhancing the stability of the description.The performance of the descriptor was strictly tested and evaluated on three public datasets,which exhibited high descriptiveness,robustness,and superior performance compared with other current methods.Furthermore,the descriptor was applied to a 3D registration trial,and the results demonstrated the reliability of our approach.
文摘With the aim of extracting the features of face images in face recognition, a new method of face recognition by fusing global features and local features is presented. The global features are extracted using principal component analysis (PCA). Active appearance model (AAM) locates 58 facial fiducial points, from which 17 points are characterized as local features using the Gabor wavelet transform (GWT). Normalized global match degree (local match degree) can be obtained by global features (local features) of the probe image and each gallery image. After the fusion of normalized global match degree and normalized local match degree, the recognition result is the class that included the gallery image corresponding to the largest fused match degree. The method is evaluated by the recognition rates over two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and robust to expression, illumination and pose variation in some degree.
基金supported by the National Natural Science Foundation of China (60802043)the National Basic Research Program of China(973 Program) (2010CB327900)
文摘Local invariant algorithm applied in downward-looking image registration,usually computes the camera's pose relative to visual landmarks.Generally,there are three requirements in the process of image registration when using these approaches.First,the algorithm is apt to be influenced by illumination.Second,algorithm should have less computational complexity.Third,the depth information of images needs to be estimated without other sensors.This paper investigates a famous local invariant feature named speeded up robust feature(SURF),and proposes a highspeed and robust image registration and localization algorithm based on it.With supports from feature tracking and pose estimation methods,the proposed algorithm can compute camera poses under different conditions of scale,viewpoint and rotation so as to precisely localize object's position.At last,the study makes registration experiment by scale invariant feature transform(SIFT),SURF and the proposed algorithm,and designs a method to evaluate their performances.Furthermore,this study makes object retrieval test on remote sensing video.For there is big deformation on remote sensing frames,the registration algorithm absorbs the Kanade-Lucas-Tomasi(KLT) 3-D coplanar calibration feature tracker methods,which can localize interesting targets precisely and efficiently.The experimental results prove that the proposed method has a higher localization speed and lower localization error rate than traditional visual simultaneous localization and mapping(vSLAM) in a period of time.
基金This research was funded by the Science and Technology Support Plan Project of Hebei Province(grant numbers 17210803D and 19273703D)the Science and Technology Spark Project of the Hebei Seismological Bureau(grant number DZ20180402056)+1 种基金the Education Department of Hebei Province(grant number QN2018095)the Polytechnic College of Hebei University of Science and Technology.
文摘The appearance of pedestrians can vary greatly from image to image,and different pedestrians may look similar in a given image.Such similarities and variabilities in the appearance and clothing of individuals make the task of pedestrian re-identification very challenging.Here,a pedestrian re-identification method based on the fusion of local features and gait energy image(GEI)features is proposed.In this method,the human body is divided into four regions according to joint points.The color and texture of each region of the human body are extracted as local features,and GEI features of the pedestrian gait are also obtained.These features are then fused with the local and GEI features of the person.Independent distance measure learning using the cross-view quadratic discriminant analysis(XQDA)method is used to obtain the similarity of the metric function of the image pairs,and the final similarity is acquired by weight matching.Evaluation of experimental results by cumulative matching characteristic(CMC)curves reveals that,after fusion of local and GEI features,the pedestrian re-identification effect is improved compared with existing methods and is notably better than the recognition rate of pedestrian re-identification with a single feature.
基金Supported by the National Natural Science Foun-dation of China (60373062 ,60573045)
文摘This paper proposes a novel robust image watermarking scheme for digital images using local invariant features and Independent Component Analysis (ICA). Most present watermarking algorithms are unable to resist geometric distortions that desynchronize the location. The method we propose here is robust to geometric attacks. In order to resist geometric distortions, we use a local invariant feature of the image called the scale invariant feature transform, which is invariant to translation and scaling distortions. The watermark is inserted into the circular patches generated by scale-invariant key point extractor. Rotation invariance is achieved using the translation property of the polar-mapped circular patches. Our method belongs to the blind watermark category, because we use Independent Component Analysis for detection that does not need the original image during detection. Experimental results show that our method is robust against geometric distortion attacks as well as signal-processing attacks.
基金supported by the Korea Research Foundation Grant funded by the Korean Government(MOEHRD),the MKE(The Ministry of Knowledge Economy,Korea)the ITRC(Information Technology Research Center)support program(NIPA-2009-(C1090-0902-0007))
文摘Vehicle detectition in still images is a comparatively difficult task. This paper presents a method for this task by using boosted local pattern detector constructed from two local features including Haar-like and oriented gradient features. The whole process is composed of three stages. In the first stage, local appearance features of vehicles and non-vehicle objects are extracted. Haar-tike and oriented gradient features are extracted separately in this stage as local features. In the second stage, Adabeost algorithm is used to select the most discriminative features as weak detectors from the two local feature sets, and a strong local pattern detector is built by the weighted combination of these selected weak detectors. Finally, vehicle detection can be performed in still images by using the boosted strong local feature detector. Experiment results show that the local pattern detector constructed in this way combines the advantages of Haar-like and oriented gradient features, and can achieve better detection results than the detector by using single Haar-like features.
文摘The fingerspelling recognition by hand shape is an important step for developing a human-computer interaction system. A method of fingerspelling recognition by hand shape using HLAC (higher-order local auto-correlation) features is proposed. Furthermore, in order to use HLAC features more effectively, the use of image processing techniques: reducing an image resolution, dividing an image, and image pre-processing techniques, is also proposed. The experimental results show that the proposed method is promising.
基金Supported by the Major Program of National Natural Science Foundation of China (No. 70890080 and No. 70890083)
文摘In this paper, we propose a product image retrieval method based on the object contour corners, image texture and color. The product image mainly highlights the object and its background is very simple. According to these characteristics, we represent the object using its contour, and detect the corners of contour to reduce the number of pixels. Every corner is described using its approximate curvature based on distance. In addition, the Block Difference of Inverse Probabilities (BDIP) and Block Variation of Local Correlation (BVLC) texture features and color moment are extracted from image's HIS color space. Finally, dynamic time warping method is used to match features with different length. In order to demonstrate the effect of the proposed method, we carry out experiments in Mi-crosoft product image database, and compare it with other feature descriptors. The retrieval precision and recall curves show that our method is feasible.
基金Supported by Project of Shandong Province Higher Educational Science and Technology Program(No.J11LG77)
文摘In this paper, we present a tire defect detection algorithm based on sparse representation. The dictionary learned from reference images can efficiently represent the test image. As the representation coefficients of normal images have a specific distribution, the local feature can be estimate by comparing representation coefficient distribution. Meanwhile, a coding length is used to measure the global features of representation coefficients. The tire defect is located by both these local and global features. Experimental results demonstrate that the proposed method can accurately detect and locate the tire defects.
文摘In pedestrian re-recognition,the traditional pedestrian re-recognition method will be affected by the changes of background,veil,clothing and so on,which will make the recognition effect decline.In order to reduce the impact of background,veil,clothing and other changes on the recognition effect,this paper proposes a pedestrian re-recognition method based on the cycle-consistent generative adversarial network and multifeature fusion.By comparing the measured distance between two pedestrians,pedestrian re-recognition is accomplished.Firstly,this paper uses Cycle GAN to transform and expand the data set,so as to reduce the influence of pedestrian posture changes as much as possible.The method consists of two branches:global feature extraction and local feature extraction.Then the global feature and local feature are fused.The fused features are used for comparison measurement learning,and the similarity scores are calculated to sort the samples.A large number of experimental results on large data sets CUHK03 and VIPER show that this new method reduces the influence of background,veil,clothing and other changes on the recognition effect.
文摘Person re-identification has emerged as a hotspot for computer vision research due to the growing demands of social public safety requirements and the quick development of intelligent surveillance networks.Person re-identification(Re-ID)in video surveillance system can track and identify suspicious people,track and statistically analyze persons.The purpose of person re-identification is to recognize the same person in different cameras.Deep learning-based person re-identification research has produced numerous remarkable outcomes as a result of deep learning's growing popularity.The purpose of this paperis to help researchers better understand where person re-identification research is at the moment and where it is headed.Firstly,this paper arranges the widely used datasets and assessment criteria in person re-identification and reviews the pertinent research on deep learning-based person re-identification techniques conducted in the last several years.Then,the commonly used method techniques are also discussed from four aspects:appearance features,metric learning,local features,and adversarial learning.Finally,future research directions in the field of person re-identification are outlooked.
基金Reform and Practice of Practical Teaching System for Applied Translation Undergraduate Majors from the Perspective of Technology Hard Trend of Henan Province Education Reform Project in 2024(Project number:2024SJGLX0581)Teaching Reform Project of Zhengzhou University of Science and Technology in 2024,”Innovative Research on Practical Teaching of Digital-Intelligence Technology Enabling Production-Teaching Integration”(Project number:2024JGZD11).
文摘Because of the ambiguity and dynamic nature of natural language,the research of named entity recognition is very challenging.As an international language,English plays an important role in the fields of science and technology,finance and business.Therefore,the early named entity recognition technology is mainly based on English,which is often used to identify the names of people,places and organizations in the text.International conferences in the field of natural language processing,such as CoNLL,MUC,and ACE,have identified named entity recognition as a specific evaluation task,and the relevant research uses evaluation corpus from English-language media organizations such as the Wall Street Journal,the New York Times,and Wikipedia.The research of named entity recognition on relevant data has achieved good results.Aiming at the sparse distribution of entities in text,a model combining local and global features is proposed.The model takes a single English character as input,and uses the local feature layer composed of local attention and convolution to process the text pieceby way of sliding window to construct the corresponding local features.In addition,the self-attention mechanism is used to generate the global features of the text to improve the recognition effect of the model on long sentences.Experiments on three data sets,Resume,MSRA and Weibo,show that the proposed method can effectively improve the model’s recognition of English named entities.
文摘A new method for solving the tiling problem of surface reconstruction is proposed. The proposed method uses a snake algorithm to segment the original images, the contours are then transformed into strings by Freeman' s code. Symbolic string matching technique is applied to establish a correspondence between the two consecutive contours. The surface is composed of the pieces reconstructed from the correspondence points. Experimental results show that the proposed method exhibits a good behavior for the quality of surface reconstruction and its time complexity is proportional to mn where m and n are the numbers of vertices of the two consecutive slices, respectively.
基金financially supported by the National High Technology Research and Development Program of China (863 Program, 2013AA102402)the 521 Talent Project of Zhejiang Sci-Tech University, Chinathe Key Research and Development Program of Zhejiang Province, China (2015C03023)
文摘A survey of the population densities of rice planthoppers is important for forecasting decisions and efficient control. Tra- ditional manual surveying of rice planthoppers is time-consuming, fatiguing, and subjective. A new three-layer detection method was proposed to detect and identify white-backed planthoppers (WBPHs, Sogatella furcifera (Horvath)) and their developmental stages using image processing. In the first two detection layers, we used an AdaBoost classifier that was trained on a histogram of oriented gradient (HOG) features and a support vector machine (SVM) classifier that was trained on Gabor and Local Binary Pattern (LBP) features to detect WBPHs and remove impurities. We achieved a detection rate of 85.6% and a false detection rate of 10.2%. In the third detection layer, a SVM classifier that was trained on the HOG features was used to identify the different developmental stages of the WBPHs, and we achieved an identification rate of 73.1%, a false identification rate of 23.3%, and a 5.6% false detection rate for the images without WBPHs. The proposed three-layer detection method is feasible and effective for the identification of different developmental stages of planthoppers on rice plants in paddy fields.
基金This research was supported by the Basic Science Research Program through the National Research Foundation of Korea(NRF)funded by the Ministry of Education(No.2018R1D1A1A02085645)Also,this work was supported by the KoreaMedical Device Development Fund grant funded by the Korean government(the Ministry of Science and ICT,the Ministry of Trade,Industry and Energy,the Ministry of Health&Welfare,theMinistry of Food and Drug Safety)(Project Number:202012D05-02).
文摘With the advancement of computer vision techniques in surveillance systems,the need for more proficient,intelligent,and sustainable facial expressions and age recognition is necessary.The main purpose of this study is to develop accurate facial expressions and an age recognition system that is capable of error-free recognition of human expression and age in both indoor and outdoor environments.The proposed system first takes an input image pre-process it and then detects faces in the entire image.After that landmarks localization helps in the formation of synthetic face mask prediction.A novel set of features are extracted and passed to a classifier for the accurate classification of expressions and age group.The proposed system is tested over two benchmark datasets,namely,the Gallagher collection person dataset and the Images of Groups dataset.The system achieved remarkable results over these benchmark datasets about recognition accuracy and computational time.The proposed system would also be applicable in different consumer application domains such as online business negotiations,consumer behavior analysis,E-learning environments,and emotion robotics.
基金supported by National High Technology Research and Development Program of China (863 Program)(No.2007AA01Z416)National Natural Science Foundation of China (No.60773056)+1 种基金Beijing New Star Project on Science and Technology (No.2007B071)Natural Science Foundation of Liaoning Province of China (No.20052184)
文摘How to construct an appropriate spatial consistent measurement is the key to improving image retrieval performance. To address this problem, this paper introduces a novel image retrieval mechanism based on the family filtration in object region. First, we supply an object region by selecting a rectangle in a query image such that system returns a ranked list of images that contain the same object, retrieved from the corpus based on 100 images, as a result of the first rank. To further improve retrieval performance, we add an efficient spatial consistency stage, which is named family-based spatial consistency filtration, to re-rank the results returned by the first rank. We elaborate the performance of the retrieval system by some experiments on the dataset selected from the key frames of "TREC Video Retrieval Evaluation 2005 (TRECVID2005)". The results of experiments show that the retrieval mechanism proposed by us has vast major effect on the retrieval quality. The paper also verifies the stability of the retrieval mechanism by increasing the number of images from 100 to 2000 and realizes generalized retrieval with the object outside the dataset.
基金This work was partially supported by the Key Laboratory for Digital Land and Resources of Jiangxi Province,East China University of Technology(DLLJ202103)Science and Technology Commission Shanghai Municipality(No.19142201600)Graduate Innovation and Entrepreneurship Program in Shanghai University in China(No.2019GY04).
文摘Target detection of small samples with a complex background is always difficult in the classification of remote sensing images.We propose a new small sample target detection method combining local features and a convolutional neural network(LF-CNN)with the aim of detecting small numbers of unevenly distributed ground object targets in remote sensing images.The k-nearest neighbor method is used to construct the local neighborhood of each point and the local neighborhoods of the features are extracted one by one from the convolution layer.All the local features are aggregated by maximum pooling to obtain global feature representation.The classification probability of each category is then calculated and classified using the scaled expected linear units function and the full connection layer.The experimental results show that the proposed LF-CNN method has a high accuracy of target detection and classification for hyperspectral imager remote sensing data under the condition of small samples.Despite drawbacks in both time and complexity,the proposed LF-CNN method can more effectively integrate the local features of ground object samples and improve the accuracy of target identification and detection in small samples of remote sensing images than traditional target detection methods.
基金This project is supported by National Natural Science Foundation of China (No. 50275078)
文摘A two-stage object recognition algorithm with the presence of occlusion is presented for microassembly. Coarse localization determines whether template is in image or not and approximately where it is, and fine localization gives its accurate position. In coarse localization, local feature, which is invariant to translation, rotation and occlusion, is used to form signatures. By comparing signature of template with that of image, approximate transformation parameter from template to image is obtained, which is used as initial parameter value for fine localization. An objective function, which is a function of transformation parameter, is constructed in fine localization and minimized to realize sub-pixel localization accuracy. The occluded pixels are not taken into account in objective function, so the localization accuracy will not be influenced by the occlusion.