The appearance of pedestrians can vary greatly from image to image, and different pedestrians may look similar in a given image. Such similarities and variabilities in the appearance and clothing of individuals make the task of pedestrian re-identification very challenging. Here, a pedestrian re-identification method based on the fusion of local features and gait energy image (GEI) features is proposed. In this method, the human body is divided into four regions according to joint points. The color and texture of each region of the human body are extracted as local features, and GEI features of the pedestrian gait are also obtained. The local features and GEI features of each person are then fused. Independent distance metric learning with the cross-view quadratic discriminant analysis (XQDA) method is used to obtain the similarity of each image pair, and the final similarity is acquired by weighted matching. Evaluation of experimental results by cumulative matching characteristic (CMC) curves reveals that, after fusion of local and GEI features, re-identification performance improves over existing methods and the recognition rate is notably higher than that of pedestrian re-identification with a single feature.
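The weighted matching and CMC evaluation mentioned in this abstract can be illustrated with a small sketch. It assumes a plain linear weighting of two distance matrices with a hypothetical weight `w`, and that every probe identity appears in the gallery; the names `fuse` and `cmc` and the toy distances are illustrative only and are not taken from the paper.

```python
def fuse(d_local, d_gei, w=0.5):
    """Weighted sum of two probe-by-gallery distance matrices (w is a hypothetical weight)."""
    return [[w * a + (1 - w) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(d_local, d_gei)]


def cmc(dist, probe_ids, gallery_ids):
    """Return a list where entry k-1 is the fraction of probes matched within the top k.

    Assumes each probe identity occurs in gallery_ids; otherwise next() raises StopIteration.
    """
    n_gallery = len(gallery_ids)
    hits = [0] * n_gallery
    for row, pid in zip(dist, probe_ids):
        # Gallery indices sorted from smallest to largest distance.
        order = sorted(range(n_gallery), key=row.__getitem__)
        # Zero-based rank of the first gallery entry with the correct identity.
        rank = next(r for r, g in enumerate(order) if gallery_ids[g] == pid)
        for k in range(rank, n_gallery):
            hits[k] += 1
    return [h / len(probe_ids) for h in hits]


# Toy data: two probes, two gallery images (made-up distances).
d_local = [[0.2, 0.4], [0.3, 0.6]]
d_gei = [[0.1, 0.5], [0.9, 0.1]]
single_feature_curve = cmc(d_local, [1, 2], [1, 2])
fused_curve = cmc(fuse(d_local, d_gei), [1, 2], [1, 2])
```

In this toy case the second probe is ranked incorrectly by the local distances alone but correctly after fusion, which is the kind of effect a CMC comparison is meant to expose.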
The task of food image recognition, a nuanced subset of fine-grained image recognition, grapples with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. Addressing these complexities, our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion, grounded in the ConvNeXt architecture. Our model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. Furthermore, it introduces a multi-stage local fusion (MSLF) module, fostering long-distance dependencies between feature maps at varying stages. This approach facilitates the assimilation of complementary features across scales, significantly bolstering the model's capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 different categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets reveals that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures not only mark an improvement of 1.04%, 3.42%, and 1.36% over the foundational ConvNeXt network but also surpass the performance of most contemporary food image recognition methods. Such advancements underscore the efficacy of our proposed model in navigating the intricate landscape of food image recognition, setting a new benchmark for the field.
3D model retrieval can benefit many downstream virtual reality applications. In this paper, we propose a new sketch-based 3D model retrieval framework by coupling local features and manifold ranking. On the technical front, we exploit spatial-pyramid-based local structures to facilitate the efficient construction of feature descriptors. Meanwhile, we propose an improved manifold ranking method, wherein all the categories between arbitrary model pairs are taken into account. Since smooth and detail-preserving line drawings of 3D models are important for sketch-based 3D model retrieval, the Difference of Gaussians (DoG) method is employed to extract the line drawings over the projected depth images of a 3D model, and Bézier curves are then adopted to further optimize the extracted line drawings. On that basis, we develop a 3D model retrieval engine to verify our method. We have conducted extensive experiments over various public benchmarks, and have made comprehensive comparisons with some state-of-the-art 3D retrieval methods. All the evaluation results based on the widely used indicators show the superiority of our method in accuracy, reliability, robustness, and versatility.
Object representation based on local features is a topical subject in the domain of image understanding and computer vision. We discuss the defects of global features in present methods and the advantages of local features in object recognition, and briefly explore state-of-the-art recognition methods using local features, especially the main approaches of local feature extraction and object representation. To clearly explain these methods, the problem of local feature extraction is divided into feature region detection, feature region description, and feature space optimization. The main components and merits of these steps are presented. Technologies for object representation are classified into three types: vector space, sliding window, and structure relationship models. Future development trends are discussed briefly.
This paper proposes a novel robust watermarking scheme for digital images using local invariant features and Independent Component Analysis (ICA). Most present watermarking algorithms are unable to resist geometric distortions that desynchronize the watermark location. The method we propose here is robust to geometric attacks. In order to resist geometric distortions, we use a local invariant feature of the image called the scale invariant feature transform, which is invariant to translation and scaling distortions. The watermark is inserted into the circular patches generated by a scale-invariant keypoint extractor. Rotation invariance is achieved using the translation property of the polar-mapped circular patches. Our method belongs to the blind watermark category, because we use Independent Component Analysis for detection, which does not need the original image. Experimental results show that our method is robust against geometric distortion attacks as well as signal-processing attacks.
The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Deep metric learning methods are used by most researchers to calculate the similarity between the query and the candidate image by fusing the global feature of the query image and the text feature. However, the text usually corresponds to the local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework of image retrieval with text manipulation by local feature modification (LFM-IR) which can focus on the related image regions and attributes and perform modification. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size (CSS), Massachusetts Institute of Technology (MIT) States and Fashion200K (+8.3%, +0.7% and +4.6% in R@1).
Background: The classification of Alzheimer's disease (AD) from magnetic resonance imaging (MRI) has been challenged by a lack of effective and reliable biomarkers due to inter-subject variability. This article presents a classification method for AD based on kernel density estimation (KDE) of local features. Methods: First, a large number of local features were extracted from stable image blobs to represent various anatomical patterns as potential effective biomarkers. Based on distinctive descriptors and locations, the local features were robustly clustered to identify correspondences of the same underlying patterns. Then, KDE was used to estimate distribution parameters of the correspondences by weighting contributions according to their distances. Thus, biomarkers could be reliably quantified by reducing the effects of more distant correspondences, which were more likely to be noise from inter-subject variability. Finally, the Bayes classifier was applied to the distribution parameters for the classification of AD. Results: Experiments were performed on different divisions of a publicly available database to investigate the accuracy and the effects of age and AD severity. Our method achieved an equal error classification rate of 0.85 for subjects aged 60-80 years exhibiting mild AD and outperformed a recent local feature-based work regardless of both effects. Conclusions: We proposed a volumetric brain MRI classification method for neurodegenerative disease based on statistics of local features using KDE. The method may be potentially useful for computer-aided diagnosis in clinical settings.
Fingerspelling recognition by hand shape is an important step in developing a human-computer interaction system. A method of fingerspelling recognition by hand shape using HLAC (higher-order local auto-correlation) features is proposed. Furthermore, in order to use HLAC features more effectively, the use of image processing techniques is also proposed: reducing the image resolution, dividing the image, and image pre-processing. The experimental results show that the proposed method is promising.
Obtaining a 3D feature description with high descriptiveness and robustness under complicated nuisances is a significant and challenging task in 3D feature matching. This paper proposes a novel feature description consisting of a stable local reference frame (LRF) and a feature descriptor based on local spatial voxels. First, an improved LRF was designed by incorporating distance weights into Z- and X-axis calculations. Subsequently, based on the LRF and voxel segmentation, a feature descriptor based on voxel homogenization was proposed. Moreover, uniform segmentation of cube voxels was performed, considering the eigenvalues of each voxel and its neighboring voxels, thereby enhancing the stability of the description. The performance of the descriptor was strictly tested and evaluated on three public datasets, on which it exhibited high descriptiveness, robustness, and superior performance compared with other current methods. Furthermore, the descriptor was applied to a 3D registration trial, and the results demonstrated the reliability of our approach.
In this paper, we present a tire defect detection algorithm based on sparse representation. The dictionary learned from reference images can efficiently represent the test image. As the representation coefficients of normal images have a specific distribution, the local feature can be estimated by comparing representation coefficient distributions. Meanwhile, a coding length is used to measure the global features of the representation coefficients. The tire defect is located using both these local and global features. Experimental results demonstrate that the proposed method can accurately detect and locate tire defects.
Vehicle detection in still images is a comparatively difficult task. This paper presents a method for this task that uses a boosted local pattern detector constructed from two kinds of local features: Haar-like and oriented gradient features. The whole process is composed of three stages. In the first stage, local appearance features of vehicles and non-vehicle objects are extracted. Haar-like and oriented gradient features are extracted separately in this stage as local features. In the second stage, the AdaBoost algorithm is used to select the most discriminative features as weak detectors from the two local feature sets, and a strong local pattern detector is built by the weighted combination of these selected weak detectors. Finally, vehicle detection can be performed in still images by using the boosted strong local feature detector. Experimental results show that the local pattern detector constructed in this way combines the advantages of Haar-like and oriented gradient features, and can achieve better detection results than a detector using Haar-like features alone.
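The weighted combination of weak detectors described in the second stage can be sketched as follows. This is a minimal illustration of the standard discrete AdaBoost decision rule, not the authors' implementation; the threshold stumps, feature indices and error values are made up, and real weak detectors would operate on Haar-like or oriented gradient responses.

```python
import math


def alpha_from_error(err):
    """Standard discrete-AdaBoost vote weight for a weak detector with weighted error err.

    A useful weak detector has 0 < err < 0.5, which gives a positive weight.
    """
    return 0.5 * math.log((1.0 - err) / err)


def strong_detector(weak_detectors, x):
    """Combine (alpha, h) pairs, where h(x) returns +1 (vehicle) or -1 (non-vehicle)."""
    score = sum(alpha * h(x) for alpha, h in weak_detectors)
    return 1 if score >= 0 else -1


# Two hypothetical threshold stumps on a two-element feature vector.
weak = [
    (alpha_from_error(0.1), lambda x: 1 if x[0] > 0.5 else -1),
    (alpha_from_error(0.4), lambda x: 1 if x[1] > 0.2 else -1),
]
label_a = strong_detector(weak, [0.9, 0.0])
label_b = strong_detector(weak, [0.1, 0.9])
```

Because the first stump has a much lower error, its vote weight dominates when the two stumps disagree.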
Because of the ambiguity and dynamic nature of natural language, research on named entity recognition is very challenging. As an international language, English plays an important role in the fields of science and technology, finance and business. Therefore, early named entity recognition technology was mainly based on English and is often used to identify the names of people, places and organizations in text. International conferences in the field of natural language processing, such as CoNLL, MUC, and ACE, have identified named entity recognition as a specific evaluation task, and the relevant research uses evaluation corpora from English-language media organizations such as the Wall Street Journal, the New York Times, and Wikipedia. Research on named entity recognition with the relevant data has achieved good results. Aiming at the sparse distribution of entities in text, a model combining local and global features is proposed. The model takes a single English character as input, and uses a local feature layer composed of local attention and convolution to process the text piece by way of a sliding window to construct the corresponding local features. In addition, the self-attention mechanism is used to generate the global features of the text to improve the recognition performance of the model on long sentences. Experiments on three data sets, Resume, MSRA and Weibo, show that the proposed method can effectively improve the model's recognition of English named entities.
Local invariant algorithms applied in downward-looking image registration usually compute the camera's pose relative to visual landmarks. Generally, image registration with these approaches has three requirements. First, the algorithm is apt to be influenced by illumination and should be robust to it. Second, the algorithm should have low computational complexity. Third, the depth information of images needs to be estimated without other sensors. This paper investigates a well-known local invariant feature, the speeded up robust feature (SURF), and proposes a high-speed and robust image registration and localization algorithm based on it. Supported by feature tracking and pose estimation methods, the proposed algorithm can compute camera poses under different conditions of scale, viewpoint and rotation so as to precisely localize an object's position. The study then conducts registration experiments with the scale invariant feature transform (SIFT), SURF and the proposed algorithm, and designs a method to evaluate their performance. Furthermore, this study carries out an object retrieval test on remote sensing video. Because remote sensing frames exhibit large deformation, the registration algorithm incorporates the Kanade-Lucas-Tomasi (KLT) 3-D coplanar calibration feature tracker, which can localize targets of interest precisely and efficiently. The experimental results show that the proposed method has a higher localization speed and lower localization error rate than traditional visual simultaneous localization and mapping (vSLAM) over a period of time.
In this paper, we propose a product image retrieval method based on object contour corners, image texture and color. A product image mainly highlights the object, and its background is very simple. According to these characteristics, we represent the object using its contour, and detect the corners of the contour to reduce the number of pixels. Every corner is described using its approximate curvature based on distance. In addition, the Block Difference of Inverse Probabilities (BDIP) and Block Variation of Local Correlation (BVLC) texture features and color moments are extracted from the image's HIS color space. Finally, the dynamic time warping method is used to match features of different lengths. In order to demonstrate the effect of the proposed method, we carry out experiments on the Microsoft product image database, and compare it with other feature descriptors. The retrieval precision and recall curves show that our method is feasible.
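Dynamic time warping, which the abstract uses to match feature sequences of different lengths, can be sketched in a few lines. The example assumes scalar sequences (for instance, one curvature value per corner) and an absolute-difference local cost; both choices are illustrative assumptions, and the paper's actual descriptors and cost are not specified here.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two numeric sequences of possibly different length."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] matched again
                                     cost[i][j - 1],      # b[j-1] matched again
                                     cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


# Sequences of different lengths that differ only by one repeated value.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

A repeated element in one sequence is absorbed by the warping path, so two contours with the same shape but a different number of detected corners can still align at low cost.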
Person re-identification has emerged as a hotspot for computer vision research due to the growing demands of public safety and the quick development of intelligent surveillance networks. Person re-identification (Re-ID) in video surveillance systems can track and identify suspicious people and support tracking and statistical analysis of persons. The purpose of person re-identification is to recognize the same person across different cameras. Deep learning-based person re-identification research has produced numerous remarkable outcomes as a result of deep learning's growing popularity. The purpose of this paper is to help researchers better understand where person re-identification research is at the moment and where it is headed. Firstly, this paper organizes the widely used datasets and assessment criteria in person re-identification and reviews the pertinent research on deep learning-based person re-identification techniques conducted in the last several years. Then, the commonly used techniques are discussed from four aspects: appearance features, metric learning, local features, and adversarial learning. Finally, future research directions in the field of person re-identification are outlined.
With the aim of extracting the features of face images in face recognition, a new method of face recognition by fusing global features and local features is presented. The global features are extracted using principal component analysis (PCA). An active appearance model (AAM) locates 58 facial fiducial points, from which 17 points are characterized as local features using the Gabor wavelet transform (GWT). The normalized global match degree (local match degree) can be obtained from the global features (local features) of the probe image and each gallery image. After the fusion of the normalized global match degree and normalized local match degree, the recognition result is the class that includes the gallery image corresponding to the largest fused match degree. The method is evaluated by the recognition rates over two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and robust to expression, illumination and pose variation to some degree.
A new method for solving the tiling problem of surface reconstruction is proposed. The proposed method uses a snake algorithm to segment the original images, and the contours are then transformed into strings by Freeman's code. A symbolic string matching technique is applied to establish a correspondence between two consecutive contours. The surface is composed of the pieces reconstructed from the corresponding points. Experimental results show that the proposed method performs well in terms of the quality of surface reconstruction and that its time complexity is proportional to mn, where m and n are the numbers of vertices of the two consecutive slices, respectively.
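The step of turning a contour into a string with Freeman's code can be illustrated with a short sketch. It assumes an ordered, 8-connected list of integer contour points and the common convention of numbering the eight directions counter-clockwise from east with the y axis pointing up; the function name and the convention are assumptions for illustration, and the string matching step of the paper is not shown.

```python
# 8-connected Freeman directions, counter-clockwise from east, y axis pointing up (assumed).
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def freeman_chain_code(points):
    """Encode an ordered list of 8-connected (x, y) contour points as a digit string."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"{(x0, y0)} and {(x1, y1)} are not 8-connected neighbours")
        codes.append(str(DIRECTIONS[step]))
    return "".join(codes)


# A unit square traversed counter-clockwise and closed back on its start point.
square_code = freeman_chain_code([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
```

Once two consecutive contours are encoded this way, a string matching method can compare the digit sequences instead of raw coordinates.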
Target detection of small samples with a complex background is always difficult in the classification of remote sensing images. We propose a new small sample target detection method combining local features and a convolutional neural network (LF-CNN) with the aim of detecting small numbers of unevenly distributed ground object targets in remote sensing images. The k-nearest neighbor method is used to construct the local neighborhood of each point, and the features of the local neighborhoods are extracted one by one by the convolution layer. All the local features are aggregated by maximum pooling to obtain a global feature representation. The classification probability of each category is then calculated and classified using the scaled exponential linear unit (SELU) function and the fully connected layer. The experimental results show that the proposed LF-CNN method has a high accuracy of target detection and classification for hyperspectral imager remote sensing data under the condition of small samples. Despite drawbacks in both time and complexity, the proposed LF-CNN method can integrate the local features of ground object samples more effectively and improve the accuracy of target identification and detection in small samples of remote sensing images compared with traditional target detection methods.
The matching of local descriptors is currently a key tool in computer vision, with a wide variety of methods designed for tasks such as image classification, object recognition and tracking, image stitching, or data mining relying on it. Local feature description techniques are usually developed so as to provide invariance to photometric variations specific to the acquisition of natural images, but are nonetheless used in association with biomedical imaging as well. It has been previously shown that the matching of gradient-based descriptors is affected by image modifications specific to Confocal Scanning Laser Microscopy (CSLM). In this paper we extend our previous work in this direction and show how specific acquisition or post-processing methods alleviate or accentuate this problem.
A survey of the population densities of rice planthoppers is important for forecasting decisions and efficient control. Traditional manual surveying of rice planthoppers is time-consuming, fatiguing, and subjective. A new three-layer detection method was proposed to detect and identify white-backed planthoppers (WBPHs, Sogatella furcifera (Horvath)) and their developmental stages using image processing. In the first two detection layers, we used an AdaBoost classifier that was trained on histogram of oriented gradient (HOG) features and a support vector machine (SVM) classifier that was trained on Gabor and Local Binary Pattern (LBP) features to detect WBPHs and remove impurities. We achieved a detection rate of 85.6% and a false detection rate of 10.2%. In the third detection layer, an SVM classifier that was trained on the HOG features was used to identify the different developmental stages of the WBPHs, and we achieved an identification rate of 73.1%, a false identification rate of 23.3%, and a 5.6% false detection rate for the images without WBPHs. The proposed three-layer detection method is feasible and effective for the identification of different developmental stages of planthoppers on rice plants in paddy fields.
Funding: This research was funded by the Science and Technology Support Plan Project of Hebei Province (grant numbers 17210803D and 19273703D), the Science and Technology Spark Project of the Hebei Seismological Bureau (grant number DZ20180402056), the Education Department of Hebei Province (grant number QN2018095), and the Polytechnic College of Hebei University of Science and Technology.
Abstract: The appearance of pedestrians can vary greatly from image to image, and different pedestrians may look similar in a given image. Such similarities and variabilities in the appearance and clothing of individuals make the task of pedestrian re-identification very challenging. Here, a pedestrian re-identification method based on the fusion of local features and gait energy image (GEI) features is proposed. In this method, the human body is divided into four regions according to joint points. The color and texture of each region of the human body are extracted as local features, and GEI features of the pedestrian gait are also obtained. The local features and GEI features of each person are then fused. Independent distance metric learning with the cross-view quadratic discriminant analysis (XQDA) method is used to obtain the similarity of each image pair, and the final similarity is acquired by weighted matching. Evaluation of experimental results by cumulative matching characteristic (CMC) curves reveals that, after fusion of local and GEI features, re-identification performance improves over existing methods and the recognition rate is notably higher than that of pedestrian re-identification with a single feature.
Funding: The support of the Hubei Provincial Natural Science Foundation (2022CFB449) and the Science Research Foundation of the Education Department of Hubei Province (B2020061) is gratefully acknowledged.
Abstract: The task of food image recognition, a nuanced subset of fine-grained image recognition, grapples with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. Addressing these complexities, our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion, grounded in the ConvNeXt architecture. Our model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. Furthermore, it introduces a multi-stage local fusion (MSLF) module, fostering long-distance dependencies between feature maps at varying stages. This approach facilitates the assimilation of complementary features across scales, significantly bolstering the model's capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 different categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets reveals that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures not only mark an improvement of 1.04%, 3.42%, and 1.36% over the foundational ConvNeXt network but also surpass the performance of most contemporary food image recognition methods. Such advancements underscore the efficacy of our proposed model in navigating the intricate landscape of food image recognition, setting a new benchmark for the field.
Funding: The authors would like to thank Zhang Dongdong for his great help in experiments. This work was supported by the National Natural Science Foundation of China (Grant No. 61602324), the Scientific Research Project of Beijing Educational Committee (KM201710028018), the open funding project of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (BUAA-VR-17KF-12), and the Beijing Advanced Innovation Center for Imaging Technology (BAICIT-2016004).
Abstract: 3D model retrieval can benefit many downstream virtual reality applications. In this paper, we propose a new sketch-based 3D model retrieval framework by coupling local features and manifold ranking. On the technical front, we exploit spatial-pyramid-based local structures to facilitate the efficient construction of feature descriptors. Meanwhile, we propose an improved manifold ranking method, wherein all the categories between arbitrary model pairs are taken into account. Since smooth and detail-preserving line drawings of 3D models are important for sketch-based 3D model retrieval, the Difference of Gaussians (DoG) method is employed to extract the line drawings over the projected depth images of a 3D model, and Bézier curves are then adopted to further optimize the extracted line drawings. On that basis, we develop a 3D model retrieval engine to verify our method. We have conducted extensive experiments over various public benchmarks, and have made comprehensive comparisons with some state-of-the-art 3D retrieval methods. All the evaluation results based on the widely used indicators show the superiority of our method in accuracy, reliability, robustness, and versatility.
Funding: Supported by the National Basic Research Program (973) of China (No. 2012CB821206), the National Natural Science Foundation of China (No. 71201004), the Scientific Research Common Program of Beijing Municipal Commission of Education (No. KM201310011009), and the Funding Project for Innovation on Science, Technology and Graduate Education in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (Nos. PXM2012_014213_000037 and PXM2012_014213_000079).
Abstract: Object representation based on local features is a topical subject in the domain of image understanding and computer vision. We discuss the defects of global features in present methods and the advantages of local features in object recognition, and briefly explore state-of-the-art recognition methods using local features, especially the main approaches of local feature extraction and object representation. To clearly explain these methods, the problem of local feature extraction is divided into feature region detection, feature region description, and feature space optimization. The main components and merits of these steps are presented. Technologies for object representation are classified into three types: vector space, sliding window, and structure relationship models. Future development trends are discussed briefly.
Funding: Supported by the National Natural Science Foundation of China (60373062, 60573045).
Abstract: This paper proposes a novel robust watermarking scheme for digital images using local invariant features and Independent Component Analysis (ICA). Most present watermarking algorithms are unable to resist geometric distortions that desynchronize the watermark location. The method we propose here is robust to geometric attacks. To resist geometric distortions, we use a local invariant feature of the image called the scale-invariant feature transform, which is invariant to translation and scaling distortions. The watermark is inserted into the circular patches generated by the scale-invariant key point extractor. Rotation invariance is achieved using the translation property of the polar-mapped circular patches. Our method belongs to the blind watermark category, because detection uses Independent Component Analysis and does not need the original image. Experimental results show that our method is robust against geometric distortion attacks as well as signal-processing attacks.
Funding: Shanghai Sailing Program, China (No. 21YF1401300); Shanghai Science and Technology Innovation Action Plan, China (No. 19511101802); Fundamental Research Funds for the Central Universities, China (No. 2232021D-25).
Abstract: The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Most researchers use deep metric learning methods to calculate the similarity between the query and the candidate image by fusing the global feature of the query image with the text feature. However, the text usually corresponds to a local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework for image retrieval with text manipulation by local feature modification (LFM-IR), which can focus on the related image regions and attributes and modify them. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size (CSS), Massachusetts Institute of Technology (MIT) States, and Fashion200K (+8.3%, +0.7%, and +4.6% in R@1, respectively).
Funding: Grants from the Fundamental Research Funds for the Central University, the National Natural Science Foundation of China, the Beijing Nova Program, the National Science and Technology Major Project of China, the Beijing Natural Science Foundation, and the Major Project of the National Social Science Foundation.
Abstract: Background: The classification of Alzheimer's disease (AD) from magnetic resonance imaging (MRI) has been challenged by the lack of effective and reliable biomarkers due to inter-subject variability. This article presents a classification method for AD based on kernel density estimation (KDE) of local features. Methods: First, a large number of local features were extracted from stable image blobs to represent various anatomical patterns as potential effective biomarkers. Based on distinctive descriptors and locations, the local features were robustly clustered to identify correspondences of the same underlying patterns. Then, KDE was used to estimate distribution parameters of the correspondences by weighting contributions according to their distances. Thus, biomarkers could be reliably quantified by reducing the effects of more distant correspondences, which were more likely to be noise from inter-subject variability. Finally, the Bayes classifier was applied to the distribution parameters for the classification of AD. Results: Experiments were performed on different divisions of a publicly available database to investigate the accuracy and the effects of age and AD severity. Our method achieved an equal error classification rate of 0.85 for subjects aged 60-80 years exhibiting mild AD and outperformed a recent local feature-based work regardless of both effects. Conclusions: We proposed a volumetric brain MRI classification method for neurodegenerative disease based on statistics of local features using KDE. The method may be useful for computer-aided diagnosis in clinical settings.
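The distance-weighted estimation described in this abstract can be illustrated with a one-dimensional Gaussian kernel density estimate. This is only a minimal sketch: the abstract does not state the kernel, the bandwidth, or the dimensionality, so the Gaussian kernel, the `bandwidth` parameter, and the function name `gaussian_kde` are all assumptions made for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated from 1-D samples.

    Assumption: Gaussian kernel with a fixed, user-chosen bandwidth.
    Samples far from the query point contribute exponentially less.
    """
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        # Each sample adds a Gaussian bump centred on itself.
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        return total / norm

    return density


if __name__ == "__main__":
    density = gaussian_kde([0.0, 0.1, 5.0], bandwidth=0.5)
    for x in (0.05, 5.0, 10.0):
        print(x, density(x))
```

The query point near the two clustered samples receives the highest density, which mirrors the idea of down-weighting distant correspondences.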
Abstract: Fingerspelling recognition by hand shape is an important step in developing a human-computer interaction system. A method of fingerspelling recognition by hand shape using HLAC (higher-order local auto-correlation) features is proposed. Furthermore, in order to use HLAC features more effectively, the use of image processing techniques, namely reducing image resolution, dividing an image, and image pre-processing, is also proposed. The experimental results show that the proposed method is promising.
Funding: The National Natural Science Foundation of China (No. 51705469) and the Zhengzhou University Youth Talent Enterprise Cooperative Innovation Team Support Program Project (2021, 2022).
Abstract: Obtaining a 3D feature description with high descriptiveness and robustness under complicated nuisances is a significant and challenging task in 3D feature matching. This paper proposes a novel feature description consisting of a stable local reference frame (LRF) and a feature descriptor based on local spatial voxels. First, an improved LRF was designed by incorporating distance weights into Z- and X-axis calculations. Subsequently, based on the LRF and voxel segmentation, a feature descriptor based on voxel homogenization was proposed. Moreover, uniform segmentation of cube voxels was performed, considering the eigenvalues of each voxel and its neighboring voxels, thereby enhancing the stability of the description. The performance of the descriptor was strictly tested and evaluated on three public datasets, on which it exhibited high descriptiveness, robustness, and superior performance compared with other current methods. Furthermore, the descriptor was applied to a 3D registration trial, and the results demonstrated the reliability of our approach.
Funding: Supported by Project of Shandong Province Higher Educational Science and Technology Program (No. J11LG77).
Abstract: In this paper, we present a tire defect detection algorithm based on sparse representation. The dictionary learned from reference images can efficiently represent the test image. As the representation coefficients of normal images have a specific distribution, the local feature can be estimated by comparing representation coefficient distributions. Meanwhile, a coding length is used to measure the global features of the representation coefficients. The tire defect is located using both these local and global features. Experimental results demonstrate that the proposed method can accurately detect and locate tire defects.
Funding: Supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD), the MKE (The Ministry of Knowledge Economy, Korea), the ITRC (Information Technology Research Center) support program (NIPA-2009-(C1090-0902-0007)).
Abstract: Vehicle detection in still images is a comparatively difficult task. This paper presents a method for this task that uses a boosted local pattern detector constructed from two kinds of local features: Haar-like features and oriented gradient features. The whole process is composed of three stages. In the first stage, local appearance features of vehicles and non-vehicle objects are extracted; Haar-like and oriented gradient features are extracted separately as local features. In the second stage, the AdaBoost algorithm is used to select the most discriminative features from the two local feature sets as weak detectors, and a strong local pattern detector is built by the weighted combination of these selected weak detectors. Finally, vehicle detection is performed in still images using the boosted strong local feature detector. Experimental results show that the local pattern detector constructed in this way combines the advantages of Haar-like and oriented gradient features, and achieves better detection results than a detector that uses Haar-like features alone.
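The "weighted combination of selected weak detectors" in this abstract follows the usual AdaBoost form, in which a strong detector outputs the sign of a weighted vote. The sketch below shows only that final combination step under stated assumptions: the weak detectors are hypothetical threshold tests on a feature vector, their error rates are made up for illustration, and feature extraction and the boosting loop itself are omitted.

```python
import math


def weak_alpha(error):
    """Standard discrete AdaBoost weight for a weak detector with the
    given weighted training error (0 < error < 1)."""
    return 0.5 * math.log((1.0 - error) / error)


def strong_detect(x, weak_detectors):
    """Weighted vote of weak detectors; each h(x) returns +1 or -1.

    weak_detectors is a list of (alpha, h) pairs.
    Returns +1 (vehicle) or -1 (non-vehicle).
    """
    score = sum(alpha * h(x) for alpha, h in weak_detectors)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Hypothetical weak detectors: thresholds on individual feature responses.
    detectors = [
        (weak_alpha(0.1), lambda x: 1 if x[0] > 0.5 else -1),
        (weak_alpha(0.4), lambda x: 1 if x[1] > 0.5 else -1),
        (weak_alpha(0.4), lambda x: 1 if x[2] > 0.5 else -1),
    ]
    print(strong_detect((0.9, 0.1, 0.1), detectors))
```

In the usage example, one accurate weak detector outvotes two weaker ones because its weight is larger, which is the mechanism that lets boosting mix Haar-like and gradient features in a single detector.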
Funding: Reform and Practice of Practical Teaching System for Applied Translation Undergraduate Majors from the Perspective of Technology Hard Trend of Henan Province Education Reform Project in 2024 (Project number: 2024SJGLX0581); Teaching Reform Project of Zhengzhou University of Science and Technology in 2024, “Innovative Research on Practical Teaching of Digital-Intelligence Technology Enabling Production-Teaching Integration” (Project number: 2024JGZD11).
Abstract: Because of the ambiguity and dynamic nature of natural language, research on named entity recognition is very challenging. As an international language, English plays an important role in the fields of science and technology, finance, and business. Therefore, early named entity recognition technology was mainly based on English and was often used to identify the names of people, places, and organizations in text. International conferences in the field of natural language processing, such as CoNLL, MUC, and ACE, have identified named entity recognition as a specific evaluation task, and the relevant research uses evaluation corpora from English-language media organizations such as the Wall Street Journal, the New York Times, and Wikipedia. Research on named entity recognition with this data has achieved good results. To address the sparse distribution of entities in text, a model combining local and global features is proposed. The model takes a single English character as input and uses a local feature layer composed of local attention and convolution to process the text piece by piece with a sliding window, constructing the corresponding local features. In addition, the self-attention mechanism is used to generate the global features of the text to improve the recognition performance of the model on long sentences. Experiments on three datasets, Resume, MSRA, and Weibo, show that the proposed method can effectively improve the model’s recognition of English named entities.
Funding: Supported by the National Natural Science Foundation of China (60802043) and the National Basic Research Program of China (973 Program) (2010CB327900).
Abstract: Local invariant algorithms applied to downward-looking image registration usually compute the camera's pose relative to visual landmarks. Generally, there are three requirements in the process of image registration when using these approaches. First, the algorithm is apt to be influenced by illumination. Second, the algorithm should have low computational complexity. Third, the depth information of images needs to be estimated without other sensors. This paper investigates a well-known local invariant feature, the speeded up robust feature (SURF), and proposes a high-speed and robust image registration and localization algorithm based on it. With support from feature tracking and pose estimation methods, the proposed algorithm can compute camera poses under different conditions of scale, viewpoint, and rotation so as to precisely localize an object's position. The study then conducts registration experiments with the scale invariant feature transform (SIFT), SURF, and the proposed algorithm, and designs a method to evaluate their performance. Furthermore, this study conducts an object retrieval test on remote sensing video. Because there is large deformation in remote sensing frames, the registration algorithm incorporates the Kanade-Lucas-Tomasi (KLT) 3-D coplanar calibration feature tracker methods, which can localize targets of interest precisely and efficiently. The experimental results show that, over a period of time, the proposed method has a higher localization speed and a lower localization error rate than traditional visual simultaneous localization and mapping (vSLAM).
Funding: Supported by the Major Program of National Natural Science Foundation of China (No. 70890080 and No. 70890083).
Abstract: In this paper, we propose a product image retrieval method based on object contour corners, image texture, and color. A product image mainly highlights the object, and its background is very simple. According to these characteristics, we represent the object using its contour and detect the corners of the contour to reduce the number of pixels. Every corner is described using its approximate curvature based on distance. In addition, the Block Difference of Inverse Probabilities (BDIP) and Block Variation of Local Correlation (BVLC) texture features and color moments are extracted from the image's HIS color space. Finally, the dynamic time warping method is used to match features of different lengths. To demonstrate the effect of the proposed method, we carry out experiments on the Microsoft product image database and compare it with other feature descriptors. The retrieval precision and recall curves show that our method is feasible.
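Dynamic time warping, which this abstract uses to match feature sequences of different lengths, can be sketched as follows. This is the textbook formulation on one-dimensional sequences with an absolute-difference local cost; the abstract does not specify the local cost or any path constraints, so both are assumptions, and the sample curvature values are invented for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    cost[i][j] holds the minimal accumulated cost of aligning the first
    i elements of a with the first j elements of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched to an already used b element
                cost[i][j - 1],      # b[j-1] matched to an already used a element
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical corner-curvature sequences with different lengths.
    query = [1, 2, 3]
    candidate = [1, 2, 2, 3]
    print(dtw_distance(query, candidate))
```

Because one element may align with several elements of the other sequence, a contour described by three corners can still match a contour described by four without padding or resampling.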
Abstract: Person re-identification has emerged as a hotspot of computer vision research due to the growing demands of public safety and the rapid development of intelligent surveillance networks. Person re-identification (Re-ID) in video surveillance systems can track and identify suspicious people, and track and statistically analyze persons. The purpose of person re-identification is to recognize the same person across different cameras. With the growing popularity of deep learning, research on deep learning-based person re-identification has produced numerous remarkable outcomes. The purpose of this paper is to help researchers better understand where person re-identification research stands at the moment and where it is headed. First, this paper organizes the widely used datasets and assessment criteria in person re-identification and reviews the pertinent research on deep learning-based person re-identification techniques conducted in the last several years. Then, the commonly used techniques are discussed from four aspects: appearance features, metric learning, local features, and adversarial learning. Finally, future research directions in the field of person re-identification are outlined.
Abstract: With the aim of extracting the features of face images in face recognition, a new method of face recognition that fuses global features and local features is presented. The global features are extracted using principal component analysis (PCA). An active appearance model (AAM) locates 58 facial fiducial points, of which 17 points are characterized as local features using the Gabor wavelet transform (GWT). A normalized global match degree (local match degree) can be obtained from the global features (local features) of the probe image and each gallery image. After the fusion of the normalized global match degree and the normalized local match degree, the recognition result is the class that includes the gallery image corresponding to the largest fused match degree. The method is evaluated by the recognition rates on two face image databases (AR and SJTU-IPPR). The experimental results show that the method outperforms PCA and elastic bunch graph matching (EBGM). Moreover, it is effective and robust to expression, illumination, and pose variation to some degree.
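The decision rule in this abstract (normalize the global and local match degrees, fuse them, pick the gallery image with the largest fused value) can be sketched as below. The abstract names neither the normalization nor the fusion rule, so min-max normalization, a weighted sum, the `weight` parameter, and the sample scores are all assumptions chosen for illustration.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1]; assumed normalization, not from the source."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_and_identify(global_scores, local_scores, labels, weight=0.5):
    """Fuse per-gallery-image match degrees and return the winning label.

    weight is the share given to the global match degree (assumed
    weighted-sum fusion); labels[i] is the class of gallery image i.
    """
    g = min_max_normalize(global_scores)
    loc = min_max_normalize(local_scores)
    fused = [weight * gi + (1.0 - weight) * li for gi, li in zip(g, loc)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


if __name__ == "__main__":
    # Hypothetical match degrees of one probe against three gallery images.
    label, fused = fuse_and_identify([0.2, 0.9, 0.5], [10, 30, 40], ["A", "B", "C"])
    print(label, fused)
```

Normalizing first matters because the two match degrees may live on different scales; without it, the larger-valued score would dominate the sum regardless of the weight.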
Abstract: A new method for solving the tiling problem of surface reconstruction is proposed. The proposed method uses a snake algorithm to segment the original images; the contours are then transformed into strings by Freeman's code. A symbolic string matching technique is applied to establish a correspondence between two consecutive contours. The surface is composed of the pieces reconstructed from the corresponding points. Experimental results show that the proposed method produces good-quality surface reconstruction and that its time complexity is proportional to mn, where m and n are the numbers of vertices of the two consecutive slices, respectively.
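The step that turns a contour into a string can be illustrated with the 8-direction Freeman chain code. This is a minimal sketch: the direction numbering (0 for a step in +x, increasing counterclockwise with +y upward) is one common convention and is an assumption here, as is the requirement that consecutive contour points be 8-connected neighbours.

```python
# Map each unit step (dx, dy) to its Freeman direction code.
# Assumed convention: 0 = +x, codes increase counterclockwise, +y is up.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def freeman_chain_code(points):
    """Encode an ordered list of 8-connected (x, y) contour points as a
    string of direction digits."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not 8-neighbours")
        codes.append(str(DIRECTIONS[step]))
    return "".join(codes)


if __name__ == "__main__":
    # A unit square traversed counterclockwise and closed on its start point.
    square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
    print(freeman_chain_code(square))
```

Once two consecutive contours are encoded this way, any string matching technique with a table of size m by n can align them, which is consistent with the mn time complexity the abstract reports.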
Funding: This work was partially supported by the Key Laboratory for Digital Land and Resources of Jiangxi Province, East China University of Technology (DLLJ202103), the Science and Technology Commission Shanghai Municipality (No. 19142201600), and the Graduate Innovation and Entrepreneurship Program in Shanghai University in China (No. 2019GY04).
Abstract: Target detection with small samples against a complex background is always difficult in the classification of remote sensing images. We propose a new small-sample target detection method combining local features and a convolutional neural network (LF-CNN), with the aim of detecting small numbers of unevenly distributed ground object targets in remote sensing images. The k-nearest neighbor method is used to construct the local neighborhood of each point, and the features of the local neighborhoods are extracted one by one by the convolution layer. All the local features are aggregated by maximum pooling to obtain a global feature representation. The classification probability of each category is then calculated, and classification is performed using the scaled exponential linear units function and the fully connected layer. The experimental results show that the proposed LF-CNN method achieves high accuracy in target detection and classification for hyperspectral imager remote sensing data under small-sample conditions. Despite drawbacks in both time and complexity, the proposed LF-CNN method integrates the local features of ground object samples more effectively than traditional target detection methods and improves the accuracy of target identification and detection with small samples of remote sensing images.
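Two of the building blocks named in this abstract, k-nearest-neighbor neighborhoods and max-pooling aggregation of local features into a global feature, can be sketched without a neural network. The sketch below assumes squared Euclidean distance, brute-force search, and invented sample data; the convolution layer and classifier of LF-CNN are not reproduced.

```python
def k_nearest(points, index, k):
    """Indices of the k points closest to points[index] (itself excluded).

    Assumption: squared Euclidean distance with brute-force search.
    """
    px = points[index]
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(px, q)), j)
        for j, q in enumerate(points)
        if j != index
    )
    return [j for _, j in dists[:k]]


def max_pool(features):
    """Aggregate local feature vectors into one global vector by taking
    the element-wise maximum across all of them."""
    return [max(column) for column in zip(*features)]


if __name__ == "__main__":
    # Hypothetical 2-D sample points and per-neighborhood feature vectors.
    points = [(0, 0), (1, 0), (5, 5), (0, 2)]
    print(k_nearest(points, 0, 2))
    print(max_pool([[1, 5], [3, 2], [0, 4]]))
```

Max pooling is order-independent, so the global feature does not depend on the order in which the local neighborhoods are visited.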
Funding: The UEFISCDI PN-II-PT-PCCA-2011-3.2-1162 Research Grant; the CRUS SCIEX NMS-CH Fellowship nr. 12.135.
Abstract: The matching of local descriptors is currently a key tool in computer vision, with a wide variety of methods designed for tasks such as image classification, object recognition and tracking, image stitching, or data mining relying on it. Local feature description techniques are usually developed to provide invariance to photometric variations specific to the acquisition of natural images, but are nonetheless used in association with biomedical imaging as well. It has been previously shown that the matching of gradient-based descriptors is affected by image modifications specific to Confocal Scanning Laser Microscopy (CSLM). In this paper we extend our previous work in this direction and show how specific acquisition or post-processing methods alleviate or accentuate this problem.
Funding: Financially supported by the National High Technology Research and Development Program of China (863 Program, 2013AA102402), the 521 Talent Project of Zhejiang Sci-Tech University, China, and the Key Research and Development Program of Zhejiang Province, China (2015C03023).
Abstract: A survey of the population densities of rice planthoppers is important for forecasting decisions and efficient control. Traditional manual surveying of rice planthoppers is time-consuming, fatiguing, and subjective. A new three-layer detection method was proposed to detect and identify white-backed planthoppers (WBPHs, Sogatella furcifera (Horvath)) and their developmental stages using image processing. In the first two detection layers, we used an AdaBoost classifier trained on histogram of oriented gradient (HOG) features and a support vector machine (SVM) classifier trained on Gabor and Local Binary Pattern (LBP) features to detect WBPHs and remove impurities. We achieved a detection rate of 85.6% and a false detection rate of 10.2%. In the third detection layer, an SVM classifier trained on the HOG features was used to identify the different developmental stages of the WBPHs, and we achieved an identification rate of 73.1%, a false identification rate of 23.3%, and a 5.6% false detection rate for the images without WBPHs. The proposed three-layer detection method is feasible and effective for the identification of different developmental stages of planthoppers on rice plants in paddy fields.