This paper proposes a novel robust watermarking scheme for digital images using local invariant features and Independent Component Analysis (ICA). Most existing watermarking algorithms cannot resist geometric distortions that desynchronize the watermark location. The method we propose here is robust to geometric attacks. To resist geometric distortions, we use a local invariant feature of the image, the scale-invariant feature transform (SIFT), which is invariant to translation and scaling distortions. The watermark is inserted into the circular patches generated by the scale-invariant keypoint extractor. Rotation invariance is achieved using the translation property of the polar-mapped circular patches. Our method belongs to the blind watermarking category, because it uses Independent Component Analysis for detection and therefore does not need the original image during detection. Experimental results show that our method is robust against geometric distortion attacks as well as signal-processing attacks.
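The translation property mentioned in the watermarking abstract above can be illustrated with a small sketch. It is not the authors' implementation: the helper names (`polar_map`, `rotate_image`, `circular_shift`), the synthetic `patch` function, and the grid sizes are all assumptions made for illustration. The sketch only shows that rotating a patch about its centre becomes a circular shift along the angle axis of its polar map.

```python
import math

def polar_map(image, n_radii, n_angles, max_radius):
    """Sample image(x, y) on a polar grid centred at the origin.
    Rows correspond to radii, columns to equally spaced angles."""
    grid = []
    for i in range(1, n_radii + 1):
        radius = max_radius * i / n_radii
        row = []
        for a in range(n_angles):
            phi = 2.0 * math.pi * a / n_angles
            row.append(image(radius * math.cos(phi), radius * math.sin(phi)))
        grid.append(row)
    return grid

def rotate_image(image, theta):
    """Return the image rotated counter-clockwise by theta about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    # The rotated image at point p equals the original at R^-1 p.
    return lambda x, y: image(c * x + s * y, -s * x + c * y)

def circular_shift(row, k):
    """Shift a list to the right by k positions with wrap-around."""
    k %= len(row)
    return row[-k:] + row[:-k] if k else list(row)

def patch(x, y):
    """Hypothetical, deliberately asymmetric patch intensity."""
    return x + 2.0 * y * y + 0.5 * x * y

N_ANGLES = 36   # angular samples (assumed)
K = 5           # rotate by K angular steps, i.e. 50 degrees

original = polar_map(patch, 4, N_ANGLES, 1.0)
turned = polar_map(rotate_image(patch, 2.0 * math.pi * K / N_ANGLES), 4, N_ANGLES, 1.0)

# Largest difference between the rotated polar map and the shifted original.
max_error = max(
    abs(t - o)
    for t_row, o_row in zip(turned, original)
    for t, o in zip(t_row, circular_shift(o_row, K))
)
```

Because a rotation reduces to a shift, any shift-tolerant embedding or detection step applied along the angle axis inherits rotation tolerance. The abstract does not say which step the authors use for this.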
Food image recognition, a subset of fine-grained image recognition, must cope with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. To address them, our study introduces a model that combines multiple attention mechanisms with multi-stage local fusion, built on the ConvNeXt architecture. The model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially reducing the influence of background noise. It also introduces a multi-stage local fusion (MSLF) module that establishes long-distance dependencies between feature maps at different stages. This approach lets the model assimilate complementary features across scales and strengthens its capacity for feature extraction. In addition, we constructed a dataset named Roushi60, which consists of 60 categories of common meat dishes. Empirical evaluation on the ETH Food-101, ChineseFoodNet, and Roushi60 datasets shows that our model achieves recognition accuracies of 91.12%, 82.86%, and 92.50%, respectively. These figures represent improvements of 1.04%, 3.42%, and 1.36% over the baseline ConvNeXt network and also surpass the performance of most contemporary food image recognition methods, which supports the efficacy of the proposed model.
This paper presents a method for lane boundary detection that is not affected by shadows, illumination, or uneven road conditions. The method processes grayscale images using local gradient features, the characteristic spectrum of lanes, and linear prediction. First, points on the adjacent right and left lanes are recognized using local gradient descriptors. A simple linear prediction model is then used to predict the direction of the lane markers. The contributions of this paper are the use of the vertical gradient image without conversion to a binary image (which would require a suitable threshold), and the introduction of a characteristic lane gradient spectrum within a local window to locate precise lane marking points along each horizontal scan line of the image. Experimental results show that this method has greater tolerance to shadows and low illumination conditions. A comparison is drawn between this method and recent methods reported in the literature.
The appearance of pedestrians can vary greatly from image to image, and different pedestrians may look similar in a given image. Such similarities and variabilities in the appearance and clothing of individuals make pedestrian re-identification very challenging. Here, a pedestrian re-identification method based on the fusion of local features and gait energy image (GEI) features is proposed. In this method, the human body is divided into four regions according to joint points. The color and texture of each body region are extracted as local features, and GEI features of the pedestrian's gait are also obtained. The local features and GEI features of each person are then fused. Distance metrics are learned independently with the cross-view quadratic discriminant analysis (XQDA) method to obtain the similarity of image pairs, and the final similarity is obtained by weighted matching. Evaluation of the experimental results using cumulative matching characteristic (CMC) curves shows that, after fusion of local and GEI features, re-identification performance improves over existing methods and is notably better than the recognition rate obtained with a single feature.
In recent years, simultaneous localization and mapping in dynamic environments (dynamic SLAM) has attracted significant attention from both academia and industry. Pioneering work on this technique has expanded the potential of robotic applications. Compared with standard SLAM under the static-world assumption, dynamic SLAM divides features into static and dynamic categories and makes appropriate use of each type. Dynamic SLAM can therefore provide more robust localization for intelligent robots that operate in complex dynamic environments. Additionally, to meet the demands of some high-level tasks, dynamic SLAM can be integrated with multiple object tracking. This article presents a survey of dynamic SLAM from the perspective of feature choices and discusses the advantages and disadvantages of different visual features.
With the increasing popularity of high-resolution remote sensing images, remote sensing image retrieval (RSIR) has become a major research topic. A combination of global non-subsampled shearlet transform (NSST)-domain statistical features (NSSTds) and local three-dimensional local ternary pattern (3D-LTP) features is proposed for high-resolution remote sensing images. We model the NSST coefficients of the detail subbands using a two-state Laplacian mixture (LM) distribution, whose three parameters are estimated using the expectation-maximization (EM) algorithm. We also calculate statistical parameters such as subband kurtosis and skewness from the detail subbands, along with the mean and standard deviation of the approximation subband, and concatenate all of them with the two-state LM parameters to describe the global features of the image. The properties of the NSST, such as multiscale analysis, localization, and flexible directional sensitivity, make it a suitable choice for providing an effective approximation of an image. To extract dense local features, a new 3D-LTP is proposed in which dimension reduction is performed by selecting "uniform" patterns. The 3D-LTP is calculated from the spatial RGB planes of the input image. The proposed inter-channel 3D-LTP captures not only local texture information but also color information. Finally, a fused feature representation (NSSTds-3DLTP) is proposed that combines the new global (NSSTds) and local (3D-LTP) features to enhance the discriminativeness of the features. The retrieval performance of the proposed NSSTds-3DLTP features is tested on three challenging remote sensing image datasets, WHU-RS19, Aerial Image Dataset (AID), and PatternNet, in terms of mean average precision (MAP), average normalized modified retrieval rank (ANMRR), and precision-recall (P-R) graphs. The experimental results are encouraging: the NSSTds-3DLTP features lead to superior retrieval performance compared with many well-known existing descriptors, namely Gabor RGB, Granulometry, local binary pattern (LBP), Fisher vector (FV), vector of locally aggregated descriptors (VLAD), and median robust extended local binary pattern (MRELBP). For the WHU-RS19 dataset, in terms of {MAP, ANMRR}, NSSTds-3DLTP improves upon the Gabor RGB, Granulometry, LBP, FV, VLAD, and MRELBP descriptors by {41.93%, 20.87%}, {92.30%, 32.68%}, {86.14%, 31.97%}, {18.18%, 15.22%}, {8.96%, 19.60%}, and {15.60%, 13.26%}, respectively. For AID, in terms of {MAP, ANMRR}, NSSTds-3DLTP improves upon the same descriptors by {152.60%, 22.06%}, {226.65%, 25.08%}, {185.03%, 23.33%}, {80.06%, 12.16%}, {50.58%, 10.49%}, and {62.34%, 3.24%}, respectively. For PatternNet, in terms of {MAP, ANMRR}, NSSTds-3DLTP improves upon the same descriptors by {32.79%, 10.34%}, {141.30%, 24.72%}, {17.47%, 10.34%}, {83.20%, 19.07%}, {21.56%, 3.60%}, and {19.30%, 0.48%}, respectively. The moderate dimensionality of the simple NSSTds-3DLTP descriptor allows the system to run in real time.
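For readers unfamiliar with the MAP figure used in the retrieval abstract above, the following minimal sketch shows one common way to compute it. It assumes that every relevant image appears somewhere in the ranked list, so average precision is normalised by the number of relevant hits. The function names and the example rankings are illustrative and are not taken from the paper.

```python
def average_precision(relevance):
    """Average precision of one ranked result list.
    `relevance` holds one truthy/falsy flag per rank, best match first.
    Assumption: all relevant items are present in the list."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank   # precision at this rank
    return precision_sum / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of the per-query average precisions."""
    if not rankings:
        raise ValueError("at least one query ranking is required")
    return sum(average_precision(r) for r in rankings) / len(rankings)

# Two hypothetical queries: 1 marks a relevant retrieved image.
example_rankings = [[1, 0, 1], [0, 1]]
example_map = mean_average_precision(example_rankings)
```

A higher MAP is better. ANMRR is a rank-based error measure for which lower values are better, so the two improvement figures in each pair are not directly comparable.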
Fingerspelling recognition by hand shape is an important step in developing human-computer interaction systems. A method of fingerspelling recognition by hand shape using higher-order local auto-correlation (HLAC) features is proposed. Furthermore, to use HLAC features more effectively, several image processing techniques are also proposed: reducing the image resolution, dividing the image, and image pre-processing. The experimental results show that the proposed method is promising.
In this paper, we present a tire defect detection algorithm based on sparse representation. A dictionary learned from reference images can efficiently represent the test image. Because the representation coefficients of normal images follow a specific distribution, the local feature can be estimated by comparing the distributions of the representation coefficients. Meanwhile, a coding length is used to measure the global features of the representation coefficients. The tire defect is located using both the local and global features. Experimental results demonstrate that the proposed method can accurately detect and locate tire defects.
Based on feature compression with orthogonal locality preserving projection (OLPP), a novel fault diagnosis model is proposed in this paper to achieve automated, high-precision fault diagnosis of rotating machinery. With this model, the original vibration signals of the training and test samples are first decomposed by empirical mode decomposition (EMD), and Shannon entropy is used to construct high-dimensional eigenvectors. To replace traditional feature extraction, in which features are selected manually, OLPP is introduced to automatically compress the high-dimensional eigenvectors of the training and test samples into low-dimensional eigenvectors with better discrimination. The low-dimensional eigenvectors of the training samples are then input into a Morlet wavelet support vector machine (MWSVM) to obtain a trained MWSVM. Finally, the low-dimensional eigenvectors of the test samples are input into the trained MWSVM to carry out fault diagnosis. To evaluate the proposed model, a fault diagnosis experiment on deep groove ball bearings was conducted, and the results indicate that the recognition accuracy of the proposed model for outer race cracks, inner race cracks, and ball cracks exceeds 90%. Compared with existing approaches, the proposed model combines the strengths of EMD in fault feature extraction, OLPP in feature compression, and MWSVM in pattern recognition, and achieves automated, high-precision fault diagnosis.
Feature selection has been widely used in data mining and machine learning. Its objective is to select a minimal subset of features according to some reasonable criteria so that the original task can be solved more quickly. In this article, a feature selection algorithm with a local search strategy based on the forest optimization algorithm, named FSLSFOA, is proposed. The novel local search strategy in the local seeding process guarantees the quality of the feature subsets in the forest. Next, the fitness function is improved so that it considers not only the classification accuracy but also the size of the feature subset. To avoid falling into local optima, a novel global seeding method is introduced, which selects trees from the bottom of the candidate set and gives the algorithm more diversity. Finally, FSLSFOA is compared with four feature selection methods to verify its effectiveness, and most of its results are superior to those of the compared methods.
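The FSLSFOA abstract above says the fitness function weighs classification accuracy against subset size but does not give the formula. The sketch below shows one common weighted-sum form as an illustration only. The function name, the `alpha` weight, and its default value are assumptions and are not the authors' definition.

```python
def subset_fitness(accuracy, n_selected, n_total, alpha=0.9):
    """Hypothetical fitness for a feature subset (higher is better).

    accuracy   : classification accuracy in [0, 1] obtained with the subset
    n_selected : number of features in the subset
    n_total    : number of features in the full set
    alpha      : assumed trade-off weight; 1.0 ignores subset size entirely
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("need 0 <= n_selected <= n_total and n_total > 0")
    size_reward = 1.0 - n_selected / n_total   # 1.0 for an empty subset, 0.0 for all features
    return alpha * accuracy + (1.0 - alpha) * size_reward

# With equal accuracy, the smaller subset scores higher.
small = subset_fitness(0.90, n_selected=5, n_total=10)
large = subset_fitness(0.90, n_selected=8, n_total=10)
```

With this form, two subsets of equal accuracy are ranked by size, while a large enough accuracy gain can still justify extra features.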
This paper introduces an indoor global localization method based on extending and matching features. In the proposed method, the environment is partitioned into convex subdivisions. Local extended maps of the subdivisions are then built by extending features, and together they constitute the local extended map set. While the robot moves through the environment, the local extended map of its current local environment is established and then matched against the local extended map set. Global localization in an indoor environment is then achieved by integrating the position and orientation matching rates. Both theoretical analysis and comparative experimental results are provided to verify the effectiveness of the proposed method for global localization.
In this research, a content-based image retrieval (CBIR) system for high-resolution satellite images has been developed using texture features. The proposed approach uses the local binary pattern (LBP) texture feature and a block-based scheme. The query and database images are divided into equally sized blocks, from which LBP histograms are extracted. The block histograms are then compared using the Chi-square distance. Experimental results show that the LBP representation provides a powerful tool for high-resolution satellite image (HRSI) retrieval.
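The block-based pipeline described in the CBIR abstract above can be sketched as follows. This is a minimal illustration using the basic 8-neighbour LBP on small grayscale images stored as nested lists. The function names, the block size, and the choice to sum the per-block Chi-square distances are assumptions, since the abstract does not specify the LBP variant or how block distances are combined.

```python
# Clockwise 8-neighbourhood, starting at the top-left neighbour.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """Basic LBP code of pixel (r, c): each neighbour >= centre sets one bit."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code

def block_histograms(img, block):
    """One normalised 256-bin LBP histogram per block x block tile.
    Pixels on the image border are skipped because they lack neighbours."""
    rows, cols = len(img), len(img[0])
    histograms = []
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            hist = [0.0] * 256
            count = 0
            for r in range(max(r0, 1), min(r0 + block, rows - 1)):
                for c in range(max(c0, 1), min(c0 + block, cols - 1)):
                    hist[lbp_code(img, r, c)] += 1.0
                    count += 1
            if count:
                hist = [h / count for h in hist]
            histograms.append(hist)
    return histograms

def chi_square(h1, h2):
    """Chi-square distance between two histograms (empty bins are skipped)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

def image_distance(img_a, img_b, block=4):
    """Assumed aggregation: sum of Chi-square distances over matching blocks."""
    pairs = zip(block_histograms(img_a, block), block_histograms(img_b, block))
    return sum(chi_square(ha, hb) for ha, hb in pairs)

# Two hypothetical 8x8 test images: a flat one and a checkerboard.
flat = [[5] * 8 for _ in range(8)]
checker = [[10 * ((r + c) % 2) for c in range(8)] for r in range(8)]
```

Retrieval then amounts to ranking database images by `image_distance` to the query, smallest first.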
The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Most researchers use deep metric learning methods that calculate the similarity between the query and the candidate image by fusing the global feature of the query image with the text feature. However, the text usually corresponds to a local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework for image retrieval with text manipulation by local feature modification (LFM-IR), which can focus on the related image regions and attributes and modify them. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size (CSS), Massachusetts Institute of Technology (MIT) States, and Fashion200K (+8.3%, +0.7%, and +4.6% in R@1, respectively).
Improving the accuracy of breast cancer image pattern recognition to an acceptable or desirable level with existing schemes remains challenging. Although multiple schemes have been combined to improve ultrasound image pattern recognition by reducing speckle noise, a satisfactory technique has not yet been achieved. The purpose of this study is to introduce a feature-based fusion scheme built on an enhanced uniform local binary pattern (LBP) and filtered noise reduction. To overcome the above limitations, a new descriptor that enhances the LBP features using a new threshold is proposed. This paper proposes a multi-level fusion scheme for the automatic classification of static breast cancer ultrasound images, which is carried out in two stages. First, several images are generated from a single image using pre-processing. The median and Wiener filters are used to reduce speckle noise and enhance the ultrasound image texture. This strategy allows the extraction of a powerful feature by reducing the overlap between the benign and malignant image classes. Second, the fusion mechanism produces diverse features from the different filtered images. The feasibility of using the LBP-based texture feature to categorize ultrasound images is demonstrated. The effectiveness of the proposed scheme is tested on 250 ultrasound images comprising 100 benign and 150 malignant images. The proposed method achieved an accuracy of 98%, a sensitivity of 98%, and a specificity of 99%. The fusion process, which supports a stronger decision based on different features produced from different filtered images, thus improved the results of the new LBP descriptor in terms of accuracy, sensitivity, and specificity.
Vehicle detection in still images is a comparatively difficult task. This paper presents a method for this task that uses a boosted local pattern detector constructed from two kinds of local features: Haar-like features and oriented gradient features. The whole process is composed of three stages. In the first stage, local appearance features of vehicles and non-vehicle objects are extracted, with Haar-like and oriented gradient features extracted separately. In the second stage, the AdaBoost algorithm is used to select the most discriminative features from the two local feature sets as weak detectors, and a strong local pattern detector is built as a weighted combination of the selected weak detectors. Finally, vehicle detection is performed in still images using the boosted strong local feature detector. Experimental results show that the local pattern detector constructed in this way combines the advantages of Haar-like and oriented gradient features and achieves better detection results than a detector using Haar-like features alone.
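The second stage described in the vehicle detection abstract above, selecting weak detectors and combining them with weights, can be sketched with discrete AdaBoost over threshold stumps. This is a generic textbook illustration, not the authors' detector. The feature columns stand in for Haar-like and oriented gradient responses, and the function names and the tiny data set are hypothetical.

```python
import math

def train_adaboost(samples, labels, rounds):
    """Discrete AdaBoost with one-feature threshold stumps.
    samples: list of feature vectors; labels: +1 (vehicle) or -1 (non-vehicle).
    Returns a list of (alpha, feature_index, threshold, polarity) tuples."""
    n = len(samples)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for j in range(len(samples[0])):
            for thr in sorted({x[j] for x in samples}):
                for polarity in (1, -1):
                    preds = [polarity if x[j] >= thr else -polarity for x in samples]
                    err = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
                    if best is None or err < best[0]:
                        best = (err, j, thr, polarity, preds)
        err, j, thr, polarity, preds = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)      # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)    # weight of this weak detector
        model.append((alpha, j, thr, polarity))
        # Increase the weight of misclassified samples, then renormalise.
        weights = [w * math.exp(-alpha * y * p) for w, p, y in zip(weights, preds, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, x):
    """Strong detector: sign of the weighted vote of the selected stumps."""
    score = sum(alpha * (pol if x[j] >= thr else -pol) for alpha, j, thr, pol in model)
    return 1 if score >= 0 else -1

# Hypothetical two-feature responses for four training windows.
train_x = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.3], [0.9, 0.1]]
train_y = [-1, -1, 1, 1]
detector = train_adaboost(train_x, train_y, rounds=3)
```

Because each stump reads a single feature, the boosting rounds double as feature selection across both feature sets, which matches the role AdaBoost plays in the abstract.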
Recently, there have been several attempts to apply Transformers to 3D point cloud classification. To reduce computation, most existing methods focus on local spatial attention, but they ignore point content and fail to establish relationships between distant but relevant points. To overcome this limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based): sampled points with similar features are clustered into the same class, and self-attention is computed within each class, which enables an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in separate branches. Extensive experiments show that our PointConT model achieves remarkable performance on point cloud shape classification. In particular, our method achieves 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. The source code of this paper is available at https://github.com/yahuiliu99/PointConT.
To explore the structural features of neural networks and approaches to local interconnection, geometrical structural information is introduced into the Hopfield neural network model applied to associative memory. The dynamics of recall are studied theoretically and experimentally. The results show that geometrical structural information does not help the associative memory of monolayered neural networks; moreover, it increases the error probability. If the geometrical structural information of the stored patterns must be introduced, new approaches have to be explored.
Glaucoma is a chronic, progressive optic neurodegenerative disease that leads to vision deterioration and, in most cases, produces increased pressure within the eye. This pressure results from the backup of fluid in the eye and damages the optic nerve. Early detection, diagnosis, and treatment therefore help to prevent loss of vision. In this paper, a novel method is proposed for the early detection of glaucoma using a combination of magnitude and phase features from digital fundus images. Local binary patterns (LBP) and Daugman's algorithm are used to extract the feature set. Histogram features are computed for both the magnitude and phase components. The Euclidean distance between the feature vectors is analyzed to predict glaucoma. The performance of the proposed method is compared with that of higher order spectra (HOS) features in terms of sensitivity, specificity, classification accuracy, and execution time. The proposed system achieves 95.45% for sensitivity, specificity, and classification accuracy. The proposed method also takes less execution time than the existing method based on HOS features. Hence, the proposed system is more accurate, reliable, and robust than the existing approach for predicting glaucoma.
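Several abstracts in this list report sensitivity, specificity, and accuracy. The sketch below shows the standard definitions computed from confusion-matrix counts. The function name and the counts in the example are made up for illustration and do not come from any of the papers.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts.
    tp/fn: diseased cases detected/missed; tn/fp: healthy cases cleared/flagged."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts: 50 diseased and 50 healthy test images.
sens, spec, acc = binary_metrics(tp=45, fn=5, tn=40, fp=10)
```

The three figures coincide only for particular class balances and error counts, so identical reported values are possible but depend on the test set composition.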
Obtaining a 3D feature description with high descriptiveness and robustness under complicated nuisances is a significant and challenging task in 3D feature matching. This paper proposes a novel feature description consisting of a stable local reference frame (LRF) and a feature descriptor based on local spatial voxels. First, an improved LRF was designed by incorporating distance weights into the Z-axis and X-axis calculations. Subsequently, based on the LRF and voxel segmentation, a feature descriptor based on voxel homogenization was proposed. Moreover, uniform segmentation into cube voxels was performed, considering the eigenvalues of each voxel and its neighboring voxels, thereby enhancing the stability of the description. The descriptor was rigorously evaluated on three public datasets, where it exhibited high descriptiveness, robustness, and superior performance compared with other current methods. Furthermore, the descriptor was applied in a 3D registration trial, and the results demonstrated the reliability of our approach.
Funding (robust image watermarking paper): Supported by the National Natural Science Foundation of China (60373062, 60573045).
Funding (food image recognition paper): Support from the Hubei Provincial Natural Science Foundation (2022CFB449) and the Science Research Foundation of the Education Department of Hubei Province (B2020061) is gratefully acknowledged.
Funding (pedestrian re-identification paper): This research was funded by the Science and Technology Support Plan Project of Hebei Province (grant numbers 17210803D and 19273703D), the Science and Technology Spark Project of the Hebei Seismological Bureau (grant number DZ20180402056), the Education Department of Hebei Province (grant number QN2018095), and the Polytechnic College of Hebei University of Science and Technology.
Funding (dynamic SLAM survey): This work was supported by the National Natural Science Foundation of China (Nos. 62002359 and 61836015) and the Beijing Advanced Discipline Fund (No. 115200S001).
文摘In recent years,simultaneous localization and mapping in dynamic environments(dynamic SLAM)has attracted significant attention from both academia and industry.Some pioneering work on this technique has expanded the potential of robotic applications.Compared to standard SLAM under the static world assumption,dynamic SLAM divides features into static and dynamic categories and leverages each type of feature properly.Therefore,dynamic SLAM can provide more robust localization for intelligent robots that operate in complex dynamic environments.Additionally,to meet the demands of some high-level tasks,dynamic SLAM can be integrated with multiple object tracking.This article presents a survey on dynamic SLAM from the perspective of feature choices.A discussion of the advantages and disadvantages of different visual features is provided in this article.
Abstract: With the increasing popularity of high-resolution remote sensing images, remote sensing image retrieval (RSIR) has become an important research topic. A combination of global non-subsampled shearlet transform (NSST)-domain statistical features (NSSTds) and local three-dimensional local ternary pattern (3D-LTP) features is proposed for high-resolution remote sensing images. We model the NSST image coefficients of the detail subbands using a 2-state Laplacian mixture (LM) distribution, whose three parameters are estimated using the Expectation-Maximization (EM) algorithm. We also calculate statistical parameters such as subband kurtosis and skewness from the detail subbands, along with the mean and standard deviation from the approximation subband, and concatenate all of them with the 2-state LM parameters to describe the global features of the image. The properties of NSST, such as multiscale analysis, localization, and flexible directional sensitivity, make it a suitable choice for an effective approximation of an image. To extract dense local features, a new 3D-LTP is proposed in which dimension reduction is performed by selecting 'uniform' patterns. The 3D-LTP is calculated from the spatial RGB planes of the input image. The proposed inter-channel 3D-LTP captures not only local texture information but also color information. Finally, a fused feature representation (NSSTds-3DLTP) is proposed using the new global (NSSTds) and local (3D-LTP) features to enhance the discriminativeness of the features. The retrieval performance of the proposed NSSTds-3DLTP features is tested on three challenging remote sensing image datasets, WHU-RS19, Aerial Image Dataset (AID), and PatternNet, in terms of mean average precision (MAP), average normalized modified retrieval rank (ANMRR), and precision-recall (P-R) graphs. The experimental results are encouraging: the NSSTds-3DLTP features lead to superior retrieval performance compared with many well-known existing descriptors such as Gabor RGB, Granulometry, local binary pattern (LBP), Fisher vector (FV), vector of locally aggregated descriptors (VLAD), and median robust extended local binary pattern (MRELBP). For the WHU-RS19 dataset, in terms of {MAP, ANMRR}, NSSTds-3DLTP improves upon the Gabor RGB, Granulometry, LBP, FV, VLAD, and MRELBP descriptors by {41.93%, 20.87%}, {92.30%, 32.68%}, {86.14%, 31.97%}, {18.18%, 15.22%}, {8.96%, 19.60%}, and {15.60%, 13.26%}, respectively. For AID, in terms of {MAP, ANMRR}, NSSTds-3DLTP improves upon the Gabor RGB, Granulometry, LBP, FV, VLAD, and MRELBP descriptors by {152.60%, 22.06%}, {226.65%, 25.08%}, {185.03%, 23.33%}, {80.06%, 12.16%}, {50.58%, 10.49%}, and {62.34%, 3.24%}, respectively. For PatternNet, in terms of {MAP, ANMRR}, NSSTds-3DLTP improves upon the Gabor RGB, Granulometry, LBP, FV, VLAD, and MRELBP descriptors by {32.79%, 10.34%}, {141.30%, 24.72%}, {17.47%, 10.34%}, {83.20%, 19.07%}, {21.56%, 3.60%}, and {19.30%, 0.48%}, respectively. The moderate dimensionality of the simple NSSTds-3DLTP descriptor allows the system to run in real time.
Abstract: Fingerspelling recognition by hand shape is an important step in developing a human-computer interaction system. A method of fingerspelling recognition by hand shape using HLAC (higher-order local auto-correlation) features is proposed. Furthermore, to use HLAC features more effectively, several image processing techniques are also proposed: reducing the image resolution, dividing the image, and image pre-processing. The experimental results show that the proposed method is promising.
Funding: Supported by the Project of Shandong Province Higher Educational Science and Technology Program (No. J11LG77).
Abstract: In this paper, we present a tire defect detection algorithm based on sparse representation. The dictionary learned from reference images can efficiently represent the test image. As the representation coefficients of normal images have a specific distribution, the local feature can be estimated by comparing representation coefficient distributions. Meanwhile, a coding length is used to measure the global features of the representation coefficients. The tire defect is located using both the local and global features. Experimental results demonstrate that the proposed method can accurately detect and locate tire defects.
Funding: Supported by the Fundamental Research Funds for the Central Universities of China (Grant No. CDJZR10118801).
Abstract: Based on feature compression with orthogonal locality preserving projection (OLPP), a novel fault diagnosis model is proposed in this paper to achieve automated, high-precision fault diagnosis of rotating machinery. With this model, the original vibration signals of the training and test samples are first decomposed through empirical mode decomposition (EMD), and Shannon entropy is used to construct high-dimensional eigenvectors. To replace traditional feature extraction, in which features are selected manually, OLPP is introduced to automatically compress the high-dimensional eigenvectors of the training and test samples into low-dimensional eigenvectors with better discrimination. After that, the low-dimensional eigenvectors of the training samples are input into a Morlet wavelet support vector machine (MWSVM) to obtain a trained MWSVM. Finally, the low-dimensional eigenvectors of the test samples are input into the trained MWSVM to carry out fault diagnosis. To evaluate the proposed model, a fault diagnosis experiment on deep groove ball bearings is conducted, and the results indicate that the recognition accuracy of the proposed model for outer race cracks, inner race cracks, and ball cracks is more than 90%. Compared with existing approaches, the proposed diagnosis model combines the strengths of EMD in fault feature extraction, OLPP in feature compression, and MWSVM in pattern recognition, and realizes automated, high-precision fault diagnosis.
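The abstract does not say how Shannon entropy is constructed from the EMD components, so the following is only one plausible reading, sketched under stated assumptions: each component (assumed to be already produced by an EMD routine, which is not implemented here) is summarized by the Shannon entropy of its normalized sample energies, giving one feature per component. The names `energy_entropy` and `entropy_features` are hypothetical.

```python
import math
from typing import List, Sequence


def energy_entropy(component: Sequence[float]) -> float:
    """Shannon entropy (bits) of a component's normalized sample energies."""
    energies = [v * v for v in component]
    total = sum(energies)
    if total == 0:
        return 0.0  # an all-zero component carries no energy distribution
    return sum(-(e / total) * math.log2(e / total) for e in energies if e > 0)


def entropy_features(components: Sequence[Sequence[float]]) -> List[float]:
    """Build a feature vector with one entropy value per component.

    `components` stands in for the intrinsic mode functions from EMD
    (assumption: they are computed elsewhere).
    """
    return [energy_entropy(c) for c in components]


if __name__ == "__main__":
    # Energy spread evenly over four samples versus a single spike.
    print(entropy_features([[1, 1, 1, 1], [0, 3, 0, 0]]))
```

A component whose energy is spread evenly scores high, while an impulsive component scores near zero, which is the kind of contrast a fault feature would need before compression with OLPP.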
Funding: National Science Foundation of China (Nos. U1736105, 61572259, 41942017). The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through research group No. RGP-VPP-264.
Abstract: Feature selection has been widely used in data mining and machine learning. Its objective is to select a minimal subset of features according to some reasonable criteria so as to solve the original task more quickly. In this article, a feature selection algorithm with a local search strategy based on the forest optimization algorithm, namely FSLSFOA, is proposed. The novel local search strategy in the local seeding process guarantees the quality of the feature subset in the forest. Next, the fitness function is improved so that it considers not only the classification accuracy but also the size of the feature subset. To avoid falling into a local optimum, a novel global seeding method is introduced, which selects trees at the bottom of the candidate set and gives the algorithm more diversity. Finally, FSLSFOA is compared with four feature selection methods to verify its effectiveness. Most of its results are superior to those of the comparative methods.
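The abstract states that the fitness function weighs both classification accuracy and subset size, but gives no formula. A minimal sketch of one common way to combine the two is shown here; the weighted-sum form, the weight `alpha`, and the function name `fitness` are assumptions for illustration, not the FSLSFOA definition.

```python
def fitness(accuracy: float, n_selected: int, n_total: int, alpha: float = 0.9) -> float:
    """Score a feature subset: higher accuracy and fewer features are better.

    Assumed form: alpha * accuracy + (1 - alpha) * (1 - n_selected / n_total).
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if n_total <= 0 or not 0 <= n_selected <= n_total:
        raise ValueError("invalid subset size")
    size_term = 1.0 - n_selected / n_total  # reward for discarding features
    return alpha * accuracy + (1.0 - alpha) * size_term


if __name__ == "__main__":
    # Same accuracy, different subset sizes: the smaller subset scores higher.
    print(fitness(0.9, 5, 10), fitness(0.9, 8, 10))
```

With a form like this, two subsets of equal accuracy are ranked by size, and `alpha` controls how much accuracy may be traded for compactness.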
Funding: Supported by the National Natural Science Foundation of China (61375079).
Abstract: This paper introduces an indoor global localization method based on extending and matching features. In the proposed method, the environment is partitioned into convex subdivisions. Local extended maps of the subdivisions are then built by extending features, and together they constitute the local extended map set. While the robot is moving in the environment, the local extended map of the current local environment is established and then matched with the local extended map set. Global localization in an indoor environment can therefore be achieved by integrating the position and orientation matching rates. Both theoretical analysis and comparative experimental results are provided to verify the effectiveness of the proposed method for global localization.
Abstract: In this research, a content-based image retrieval (CBIR) system for high-resolution satellite images has been developed using texture features. The proposed approach uses the local binary pattern (LBP) texture feature and a block-based scheme. The query and database images are divided into equally sized blocks, from which LBP histograms are extracted. The block histograms are then compared using the Chi-square distance. Experimental results show that the LBP representation provides a powerful tool for high-resolution satellite image (HRSI) retrieval.
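The block-based pipeline described in this abstract can be sketched as follows. This is a minimal pure-Python illustration, assuming the basic 8-neighbour LBP operator, non-overlapping square blocks, and a plain sum of per-block Chi-square distances; the block size, the neighbour ordering, and all function names are choices made for this example rather than details taken from the paper.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities

# Clockwise 8-neighbour offsets, starting at the top-left neighbour.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Image, r: int, c: int) -> int:
    """Basic LBP: set one bit per neighbour that is >= the centre pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img: Image) -> List[float]:
    """Normalized 256-bin histogram of LBP codes over interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]


def split_blocks(img: Image, block: int) -> List[Image]:
    """Cut the image into non-overlapping block x block tiles."""
    return [
        [row[c0:c0 + block] for row in img[r0:r0 + block]]
        for r0 in range(0, len(img) - block + 1, block)
        for c0 in range(0, len(img[0]) - block + 1, block)
    ]


def chi_square(h1: List[float], h2: List[float]) -> float:
    """Chi-square distance between two histograms (empty bins skipped)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def image_distance(query: Image, candidate: Image, block: int = 4) -> float:
    """Sum of Chi-square distances between corresponding block histograms."""
    pairs = zip(split_blocks(query, block), split_blocks(candidate, block))
    return sum(chi_square(lbp_histogram(q), lbp_histogram(c)) for q, c in pairs)


if __name__ == "__main__":
    flat = [[10] * 8 for _ in range(8)]
    checker = [[255 * ((r + c) % 2) for c in range(8)] for r in range(8)]
    print(image_distance(flat, flat), image_distance(flat, checker))
```

Ranking database images by `image_distance` against the query, smallest first, gives the retrieval order; comparing blocks position by position keeps some spatial layout that a single global histogram would lose.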
Funding: Shanghai Sailing Program, China (No. 21YF1401300); Shanghai Science and Technology Innovation Action Plan, China (No. 19511101802); Fundamental Research Funds for the Central Universities, China (No. 2232021D-25).
Abstract: The demand for image retrieval with text manipulation exists in many fields, such as e-commerce and Internet search. Most researchers use deep metric learning methods to calculate the similarity between the query and the candidate image by fusing the global feature of the query image with the text feature. However, the text usually corresponds to a local feature of the query image rather than the global feature. Therefore, in this paper, we propose a framework for image retrieval with text manipulation by local feature modification (LFM-IR), which can focus on the related image regions and attributes and perform the modification. A spatial attention module and a channel attention module are designed to realize the semantic mapping between image and text. We achieve excellent performance on three benchmark datasets, namely Color-Shape-Size (CSS), Massachusetts Institute of Technology (MIT) States, and Fashion200K (+8.3%, +0.7% and +4.6% in R@1).
Funding: This research received funding from Duhok Polytechnic University.
Abstract: Raising the accuracy of breast cancer image pattern recognition to an acceptable or desirable level using various schemes remains challenging. Although multiple schemes have been combined to improve ultrasound image pattern recognition by reducing speckle noise, an enhanced technique has not yet been achieved. The purpose of this study is to introduce a feature-based fusion scheme built on an enhanced uniform Local Binary Pattern (LBP) and filtered noise reduction. To overcome the above limitations, a new descriptor that enhances the LBP features based on a new threshold is proposed. This paper proposes a multi-level fusion scheme for the automatic classification of static ultrasound images of breast cancer, carried out in two stages. First, several images were generated from a single image using the pre-processing method. The median and Wiener filters were used to lessen the speckle noise and enhance the ultrasound image texture. This strategy allowed the extraction of a powerful feature by reducing the overlap between the benign and malignant image classes. Second, the fusion mechanism allowed the production of diverse features from the different filtered images. The feasibility of using the LBP-based texture feature to categorize the ultrasound images was demonstrated. The effectiveness of the proposed scheme was tested on 250 ultrasound images comprising 100 benign and 150 malignant images. The proposed method achieved very high accuracy (98%), sensitivity (98%), and specificity (99%). As a result, the fusion process, which supports a strong decision based on different features produced from different filtered images, improved the results of the new LBP descriptor in terms of accuracy, sensitivity, and specificity.
Funding: Supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD), and by the MKE (The Ministry of Knowledge Economy, Korea) under the ITRC (Information Technology Research Center) support program (NIPA-2009-(C1090-0902-0007)).
Abstract: Vehicle detection in still images is a comparatively difficult task. This paper presents a method for this task using a boosted local pattern detector constructed from two kinds of local features: Haar-like and oriented gradient features. The whole process is composed of three stages. In the first stage, local appearance features of vehicles and non-vehicle objects are extracted; Haar-like and oriented gradient features are extracted separately as local features. In the second stage, the AdaBoost algorithm is used to select the most discriminative features as weak detectors from the two local feature sets, and a strong local pattern detector is built by the weighted combination of the selected weak detectors. Finally, vehicle detection is performed in still images using the boosted strong local feature detector. Experimental results show that the local pattern detector constructed in this way combines the advantages of Haar-like and oriented gradient features and achieves better detection results than a detector using Haar-like features alone.
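The weighted combination of weak detectors in the second stage can be illustrated with the standard discrete AdaBoost rule, in which a weak detector with weighted error e receives the weight 0.5 * ln((1 - e) / e). The sketch below covers only this combination step, not feature extraction or training; the threshold stumps, the two-element feature vector (one hypothetical Haar-like response and one hypothetical oriented gradient response), and the error values are illustrative assumptions.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A weak detector maps a feature vector to +1 (vehicle) or -1 (non-vehicle).
WeakDetector = Callable[[Sequence[float]], int]


def detector_weight(error: float) -> float:
    """AdaBoost weight for a weak detector with the given weighted error."""
    eps = 1e-10
    error = min(max(error, eps), 1.0 - eps)  # avoid division by zero and log(0)
    return 0.5 * math.log((1.0 - error) / error)


def make_stump(index: int, threshold: float) -> WeakDetector:
    """Threshold a single feature response (hypothetical weak detector)."""
    return lambda x: 1 if x[index] > threshold else -1


def strong_detector(weak: List[Tuple[float, WeakDetector]], x: Sequence[float]) -> int:
    """Sign of the weighted vote of all selected weak detectors."""
    score = sum(alpha * h(x) for alpha, h in weak)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Feature 0: assumed Haar-like response; feature 1: assumed gradient response.
    ensemble = [
        (detector_weight(0.1), make_stump(0, 0.5)),
        (detector_weight(0.4), make_stump(1, 0.5)),
    ]
    print(strong_detector(ensemble, [0.9, 0.1]))
```

Because the weight grows as the error shrinks, the more reliable weak detector dominates the vote when the two disagree.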
Funding: Supported by the National Natural Science Foundation of China (61071131, 61271388), the Natural Science Foundation of Beijing (4122040), the Research Project of Tsinghua University (2012Z01011), and the Doctoral Fund of the Ministry of Education of China (20120002110036).
Funding: Supported in part by the National Natural Science Foundation of China (61876011), the National Key Research and Development Program of China (2022YFB4703700), the Key Research and Development Program 2020 of Guangzhou (202007050002), and the Key-Area Research and Development Program of Guangdong Province (2020B090921003).
Abstract: Recently, there have been some attempts to apply Transformers to 3D point cloud classification. To reduce computation, most existing methods focus on local spatial attention but ignore point content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based): sampled points with similar features are clustered into the same class, and self-attention is computed within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves remarkable performance on point cloud shape classification. In particular, our method achieves 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. The source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Abstract: To explore the structural features of neural networks and approaches to local interconnection, geometrical structural information is introduced into the Hopfield neural network model applied to associative memory. The dynamics of recall are studied theoretically and experimentally. The results show that geometrical structural information does not help the associative memory of monolayered neural networks; furthermore, it increases the error probability. If the geometrical structural information of the stored patterns must be introduced, new approaches have to be explored.
Abstract: Glaucoma is a chronic and progressive optic neurodegenerative disease that leads to vision deterioration and, in most cases, produces increased pressure within the eye. This is due to the backup of fluid in the eye, which damages the optic nerve. Hence, early detection, diagnosis, and treatment help to prevent the loss of vision. In this paper, a novel method is proposed for the early detection of glaucoma using a combination of magnitude and phase features from digital fundus images. Local binary patterns (LBP) and Daugman's algorithm are used to perform the feature set extraction. Histogram features are computed for both the magnitude and phase components. The Euclidean distance between the feature vectors is analyzed to predict glaucoma. The performance of the proposed method is compared with that of higher order spectra (HOS) features in terms of sensitivity, specificity, classification accuracy, and execution time. The proposed system achieves 95.45% for sensitivity, specificity, and classification accuracy. The proposed method also takes less execution time than the existing method based on HOS features. Hence, the proposed system is more accurate, reliable, and robust than the existing approach for predicting glaucoma features.
Funding: The National Natural Science Foundation of China, No. 51705469, and the Zhengzhou University Youth Talent Enterprise Cooperative Innovation Team Support Program Project (2021, 2022).
Abstract: Obtaining a 3D feature description with high descriptiveness and robustness under complicated nuisances is a significant and challenging task in 3D feature matching. This paper proposes a novel feature description consisting of a stable local reference frame (LRF) and a feature descriptor based on local spatial voxels. First, an improved LRF was designed by incorporating distance weights into the Z- and X-axis calculations. Subsequently, based on the LRF and voxel segmentation, a feature descriptor based on voxel homogenization was proposed. Moreover, uniform segmentation of cube voxels was performed, considering the eigenvalues of each voxel and its neighboring voxels, thereby enhancing the stability of the description. The performance of the descriptor was rigorously tested and evaluated on three public datasets, on which it exhibited high descriptiveness, robustness, and superior performance compared with other current methods. Furthermore, the descriptor was applied to a 3D registration trial, and the results demonstrated the reliability of our approach.