This paper presents a novel formulation for detecting objects with articulated rigid bodies, particularly engineering vehicles, in high-resolution monitoring images. High-resolution monitoring images contain many pixels, most of which represent the background. Our method first detects object patches in the monitoring images using a coarse detection process. In this phase, we build a descriptor based on histograms of oriented gradients, which contain color frequency information. We then use a linear support vector machine to rapidly detect many image patches that may contain object parts, with a low false negative rate and a high false positive rate. In the second phase, we apply a refinement classification to determine which patches actually contain objects. In this stage, we enlarge the image patches so that they include the complete object, using models of the object parts. An accelerated and improved salient mask is then used to improve the performance of the dense scale-invariant feature transform descriptor. The detection process returns the absolute position of positive objects in the original images. We have applied our method to three datasets to demonstrate its effectiveness.
Funding: supported by the National Natural Science Foundation of China (61172127, 11071002), the Specialized Research Fund for the Doctoral Program of Higher Education (20113401110006), and the Innovative Research Team of 211 Project in Anhui University (KJTD007A).
Abstract: A new spectral matching algorithm is proposed that uses the nonsubsampled contourlet transform and the scale-invariant feature transform. The nonsubsampled contourlet transform decomposes an image into one low-frequency image and several high-frequency images, and the scale-invariant feature transform extracts feature points from the low-frequency image. A proximity matrix is constructed for the feature points of two related images. Singular value decomposition of the proximity matrix yields a matching matrix (the matching result) that reflects the degree of matching among feature points. Experimental results indicate that the proposed algorithm reduces time complexity and achieves higher accuracy.
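The abstract does not state how the proximity matrix is built or how matches are read off the result. One well-known formulation of SVD-based spectral matching, due to Scott and Longuet-Higgins, is sketched below purely as an illustration. The symbols $\mathbf{x}_i$, $\mathbf{y}_j$, $r_{ij}$, $\sigma$, $G$, $E$, and $P$ are assumptions introduced here, not notation taken from the paper.

```latex
% Proximity matrix between feature points x_i (image 1) and y_j (image 2)
G_{ij} = \exp\!\left(-\frac{r_{ij}^{2}}{2\sigma^{2}}\right),
\qquad r_{ij} = \lVert \mathbf{x}_i - \mathbf{y}_j \rVert

% Singular value decomposition, then replace every singular value by 1
G = U D V^{\top},
\qquad P = U E V^{\top},
\qquad E_{kk} = 1

% Point i is matched to point j when P_{ij} is the largest entry
% in both row i and column j of P
```

Under this assumed formulation, $\sigma$ controls how quickly the affinity decays with distance, and the row-and-column maximum rule enforces one-to-one matches.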
Funding: Program for New Century Excellent Talents in University (grant number: 50051); the Key Project for Technology Research of the Ministry of Education of China (grant number: 106030).
Abstract: To meet the needs of fundus examination, including widening the field of view and pathology tracking, this paper describes a robust feature-based method for fully automatic mosaicking of curved human retinal images photographed by a fundus microscope. The kernel of the new algorithm is the Scale-Invariant Feature Transform, an interest point detector and feature descriptor that is invariant to scale, rotation, and illumination. After interest points are matched according to a second-nearest-neighbor strategy, the parameters of the model are estimated from the correct matches, which are extracted from the putative sets by a new inlier identification scheme based on the Sampson distance. To preserve image features, bilinear warping and multi-band blending are used to create panoramic retinal images. Experiments show that the proposed method works well, with a rejection error within 0.3 pixels, even in cases where the retinal images have no discernible vascular structure, in contrast to state-of-the-art algorithms.
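The second-nearest-neighbor strategy mentioned above is commonly implemented as a distance ratio test: a match is kept only when the nearest descriptor is clearly closer than the second nearest. The following is a minimal sketch under that assumption. The function names and the `ratio=0.8` default are illustrative choices, not values from the paper, and the Sampson-distance inlier step is not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Return (i, j) index pairs that pass the nearest/second-nearest ratio test.

    desc_a, desc_b: lists of descriptor vectors (lists of floats).
    ratio: hypothetical threshold; smaller values keep fewer, safer matches.
    """
    matches = []
    for i, d in enumerate(desc_a):
        # Sort candidate distances so the two nearest neighbours come first.
        dists = sorted((euclidean(d, e), j) for j, e in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs at least two candidates
        (best, j), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches
```

For example, `ratio_test_matches([[0, 0]], [[1, 0], [0, 1.1]])` rejects the only candidate because the two nearest distances (1.0 and 1.1) are too similar to be distinctive.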
Abstract: Content-based satellite image registration is a difficult issue in the fields of remote sensing and image processing. The difficulty is greater when matching multisource remote sensing images, which suffer from illumination, rotation, and source differences. The scale-invariant feature transform (SIFT) algorithm has been used successfully in satellite image registration problems, and many researchers have applied a local SIFT descriptor to improve the image retrieval process. Despite its robustness, this algorithm has some difficulties with the quality and quantity of the local feature points extracted in multisource remote sensing. Furthermore, the high dimensionality of the local features extracted by SIFT results in time-consuming computation and high storage requirements for the relevant information, both of which are important factors in content-based image retrieval (CBIR) applications. In this paper, a novel method is introduced to transform the local SIFT features into global features for multisource remote sensing. The quality and quantity of SIFT local features are enhanced by applying contrast equalization to the images in a pre-processing stage. Treating the local features of each image in the reference database as a separate class, linear discriminant analysis (LDA) is used to transform the local features into global features while reducing the dimensionality of the feature space. This also significantly reduces the computational time and storage required. Applying the trained kernel to verification data and mapping them yielded a successful retrieval rate of 91.67% for test feature points.
Funding: the National Science Foundation of China (No. 61471185), the Natural Science Foundation of Shandong Province (No. ZR2016FM21), the Shandong Province Science and Technology Plan Project (No. 2015GSF116001), and the Yantai City Key Research and Development Plan Project (Nos. 2014ZH157 and 2016ZH057).
Abstract: In this paper, we propose a registration method that combines morphological component analysis (MCA) and the scale-invariant feature transform (SIFT) algorithm. The method uses the perception dictionaries and combines the Basis-Pursuit algorithm with the Total-Variation regularization scheme to extract the cartoon part of the original image, which contains the basic geometrical information; this extraction is stable and insensitive to noise. A smaller number of distinctive key points is then obtained by applying the SIFT algorithm to the cartoon part of the original image. Matching the key points by the constrained Euclidean distance gives a more correct and robust matching result. The experimental results show that the geometrical transform parameters inferred from the key points matched by the MCA+SIFT registration method are more exact than those obtained with the direct SIFT algorithm.
Abstract: Analysis and recognition of ancient scripts is a challenging task because these scripts are inscribed on pillars, stones, or leaves. Optical recognition systems can help preserve, share, and accelerate the study of ancient scripts, but the lack of a standard dataset for such scripts is a major constraint. Although many scholars and researchers have captured and uploaded inscription images to various websites, manually searching for, downloading, and extracting these images is tedious and error prone. Web search queries return a vast number of irrelevant results, and manually extracting images for a specific script is not scalable. This paper proposes a novel multistage system to identify the specific set of script images from a large set of images downloaded from web sources. The proposed system combines the two most important pattern matching techniques, Scale Invariant Feature Transform (SIFT) and template matching, in a sequential pipeline. By using the key strengths of each technique, the system can discard irrelevant images while retaining images of a specific type.
Funding: supported by the National Natural Science Foundation of China (No. 61103123) and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry.
Abstract: Most existing action recognition methods rely mainly on spatio-temporal descriptors of single interest points while ignoring their potential integral information, such as spatial distribution information. By combining local spatio-temporal features with the global positional distribution information (PDI) of interest points, a novel motion descriptor is proposed in this paper. The proposed method detects interest points using an improved interest point detection method. Then, 3-dimensional scale-invariant feature transform (3D SIFT) descriptors are extracted for every interest point. To obtain a compact description and efficient computation, principal component analysis (PCA) is applied twice to the 3D SIFT descriptors, once for a single frame and once for multiple frames. Simultaneously, the PDI of the interest points is computed and combined with the above features. The combined features are quantified, selected, and finally tested using the support vector machine (SVM) recognition algorithm on the public KTH dataset. The test results show that the recognition rate is significantly improved and that the proposed features describe human motion more accurately, with high adaptability to different scenarios.
Funding: the National Natural Science Foundation of China (Nos. 60505017 and 60534070) and the Science Planning Project of Zhejiang Province, China (No. 2005C14008).
Abstract: This paper presents a purely vision-based technique for 3D reconstruction of planet terrain. The reconstruction accuracy depends ultimately on an optimization technique known as 'bundle adjustment'. In vision techniques, the translation is known only up to a scale factor, and a single scale factor is assumed for the whole sequence of images if only one camera is used. If an extra camera is available, stereo-vision-based reconstruction can be obtained from binocular views. If the baseline of the stereo setup is known, the scale factor problem is solved. We found that direct application of classical bundle adjustment to the constraints inherent between the binocular views has not been tested. Our method incorporates this constraint into the conventional bundle adjustment method. This special binocular bundle adjustment has been performed on image sequences similar to planet terrain circumstances. Experimental results show that our method enhances not only the localization accuracy but also the terrain mapping quality.
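The remark that a known stereo baseline resolves the scale factor can be illustrated with a small sketch. It assumes that a reconstruction is available in arbitrary units, including an estimated left-to-right camera translation, and that the true baseline length is known in metres. The function names and inputs are hypothetical, and this is not the paper's binocular bundle adjustment.

```python
import math


def metric_scale(baseline_m, est_left_to_right):
    """Scale factor that maps arbitrary reconstruction units to metres.

    baseline_m: known physical distance between the two cameras, in metres.
    est_left_to_right: estimated translation between the cameras, expressed
        in the arbitrary units of the reconstruction (assumed input).
    """
    norm = math.sqrt(sum(c * c for c in est_left_to_right))
    if norm == 0:
        raise ValueError("estimated baseline has zero length")
    return baseline_m / norm


def rescale(points, scale):
    """Apply the scale factor to every reconstructed 3D point."""
    return [tuple(scale * c for c in p) for p in points]
```

With a monocular sequence there is no such reference length, which is why the scale stays ambiguous in that case.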
Funding: This study was supported by the National Natural Science Foundation of China (grant 61101155) and the Jilin Province Science and Technology Development Program (20101504).
Abstract: This paper presents a biologically inspired local image descriptor that combines color and shape features. Unlike previous descriptors, it uses red-cyan cells associated with L, M, and S cones (L for long, M for medium, and S for short) to indicate one of the opponent color channels. Building on state-of-the-art color feature extraction, we introduce a new approach to compute the color orientation and magnitudes of three opponent color channels, namely red-green, blue-yellow, and red-cyan, in two-dimensional space. Color orientation is accumulated in histograms with magnitude weighting. In the final step, we linearly concatenate the four-color-opponent-channel histogram and the scale-invariant feature transform histogram. We apply our biologically inspired descriptor to describe local image features. Quantitative comparisons with state-of-the-art descriptors demonstrate its significant advantages in maintaining invariance to photometric and geometric changes in image matching, particularly in cases such as illumination variation and image blurring, where more color contrast information is observed.
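An orientation histogram with magnitude weighting can be sketched as follows for a single channel. This is a generic illustration that assumes central-difference gradients and eight equal-width bins. How the opponent channels themselves are computed is not specified in the abstract and is not shown, and the function name and parameters are hypothetical.

```python
import math


def orientation_histogram(channel, bins=8):
    """Magnitude-weighted histogram of gradient orientations.

    channel: 2D list of floats (one opponent-channel image, assumed input).
    bins: number of equal-width orientation bins over [0, 2*pi).
    """
    height, width = len(channel), len(channel[0])
    hist = [0.0] * bins
    # Skip the border so central differences stay inside the image.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = channel[y][x + 1] - channel[y][x - 1]
            gy = channel[y + 1][x] - channel[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat pixels carry no orientation
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = int(angle / (2 * math.pi) * bins) % bins
            hist[index] += magnitude  # each vote is weighted by magnitude
    return hist
```

Weighting by magnitude lets strong color edges dominate the histogram while near-flat regions contribute little.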
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFB0502502) and the Special Research and Trial Production Project of Sanya (No. sy17xs0113).
Abstract: The scale-invariant feature transform (SIFT) is often applied to extract tie-points from airborne SAR images. When the look angles of a pair of airborne SAR images differ markedly, the sizes and shapes of the shadows of the same objects also differ markedly. In the main and slave SAR images, key-points around shadows are often matched as tie-points even though they are not homologous points. This phenomenon degrades the performance of SIFT on SAR images. Building on SIFT, a modified matching method is proposed to decrease the number of incorrect tie-points. High-resolution airborne SAR images are used in the experiments. The experimental results show that the proposed method is very effective at extracting correct tie-points from SAR images.
Funding: Supported by the Postgraduate Research and Practice Innovation Program of Nanjing University of Aeronautics and Astronautics (XCXJH20220318).
Abstract: Since the outbreak of Coronavirus Disease 2019 (COVID-19), people have been advised to wear facial masks to limit the spread of the virus. Under these circumstances, traditional face recognition technologies cannot achieve satisfactory results. In this paper, we propose a face recognition algorithm that combines the traditional features and deep features of masked faces. For traditional features, we extract Local Binary Pattern (LBP), Scale-Invariant Feature Transform (SIFT), and Histogram of Oriented Gradient (HOG) features from the periocular region and use a Support Vector Machine (SVM) classifier to perform personal identification. We also propose an improved Convolutional Neural Network (CNN) model, the Angular Visual Geometry Group Network (A-VGG), to learn deep features. We then use decision-level fusion to combine the four features. Comprehensive experiments were carried out on databases of real and simulated masked faces, including frontal and side faces taken at different angles. Images with motion blur were also tested to evaluate the robustness of the algorithm. In addition, an experiment on matching a masked face with the corresponding full face was carried out. The experimental results show that the proposed algorithm achieves state-of-the-art performance in masked face recognition and that the periocular region has rich biological features and high discrimination.
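The abstract does not say which decision-level fusion rule is used. One simple possibility, assumed here only for illustration, is a weighted average of the per-identity scores produced by each of the four classifiers. The function name, score format, and weights are all hypothetical.

```python
def fuse_decisions(scores_per_model, weights=None):
    """Fuse per-identity scores from several classifiers by weighted averaging.

    scores_per_model: dict mapping a model name (e.g. "lbp") to a dict that
        maps each candidate identity to a score in [0, 1] (assumed format).
    weights: optional dict of per-model weights; equal weights by default.
    Returns the winning identity and the fused score table.
    """
    names = list(scores_per_model)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    fused = {}
    for name in names:
        for identity, score in scores_per_model[name].items():
            fused[identity] = fused.get(identity, 0.0) + weights[name] * score / total
    best = max(fused, key=fused.get)
    return best, fused
```

Other rules, such as majority voting over each classifier's top choice, would fit the same interface; score averaging is chosen here because it keeps the confidence information from every model.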
Funding: This work was financially supported by the Zhejiang Science and Technology Department Basic Public Welfare Research Project (LGN18F030001) and the Major Project of Zhejiang Science and Technology Department (2016C02G2100540).
Abstract: Road visual navigation relies on accurate road models. This study proposes an improved scale-invariant feature transform (SIFT) algorithm for recovering depth information from farmland road images, which would provide a reliable path for visual navigation. The mean image of the pixel values in five channels (R, G, B, S, and V) was treated as the inspected image, and its feature points were extracted with the Canny algorithm to locate the feature points precisely and to ensure their uniformity and density. The mean values of the pixels in 5×5 neighborhoods around each feature point, taken in eight directions at intervals of 45°, were then treated as the feature vector, and the differences between feature vectors were calculated for preliminary matching of the feature points in the left and right images. To obtain the depth information of farmland road images, the energy method of feature points was used to eliminate mismatched points. Experiments with a binocular stereo vision system showed that the matching accuracy and time consumption for depth recovery with the improved SIFT algorithm were 96.48% and 5.6 s, respectively, with an accuracy for depth recovery of -7.17%-2.97% within a certain sight distance. The mean uniformity, time consumption, and matching accuracy for all 60 images under various climates and road conditions were 50%-70%, 5.0-6.5 s, and higher than 88%, respectively, indicating that the improved SIFT algorithm outperformed the conventional SIFT algorithm in obtaining feature points (e.g., in uniformity, matching accuracy, and real-time performance). This study provides an important reference for machine-vision-based navigation technology for agricultural equipment.
Funding: the National Natural Science Foundation of China (Nos. 60970109 and 61170228)
Abstract: The global context (GC) descriptor is improved for describing interest regions: the improved version uses gradient orientation for binning and thus provides more robust invariance to geometric and photometric transformations. The performance of the improved GC (IGC) in image matching is studied through extensive experiments on the Oxford Affine dataset. Empirical results indicate that the proposed IGC yields quite stable and robust results, significantly outperforms the original GC, and can also outperform the classical scale-invariant feature transform (SIFT) in most of the test cases. By integrating the IGC with SIFT, the resulting hybrid SIFT+IGC outperforms all single descriptors in these experimental evaluations under various geometric transformations.
Funding: supported by JSPS KAKENHI (No. 23700203) and the NEDO Intelligent RT Software Project
Abstract: This paper describes a person identification method for a mobile robot that follows a specific person in dynamic, complicated environments, such as a school canteen, where many people are present. We propose a distance-dependent appearance model based on the scale-invariant feature transform (SIFT) feature. SIFT is a powerful image feature that is invariant to scale and rotation in the image plane and is also robust to changes in lighting conditions. However, the feature is weak against affine transformations, so the identification power degrades when the pose of a person changes greatly. We therefore use a set of images taken from various directions to cope with pose changes. Moreover, the number of SIFT feature matches between the model and an input image decreases as the person moves farther away from the camera. Therefore, we also use a distance-dependent threshold. A person-following experiment was conducted using an actual mobile robot, and the quality of person identification was assessed.
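The idea of a distance-dependent threshold on the number of SIFT matches can be sketched as follows. This is only an illustration: the inverse-distance form, the function names, and the parameters `base_matches`, `ref_distance_m`, and `floor` are hypothetical and are not taken from the abstract, which does not specify how the threshold varies with distance.

```python
def match_threshold(distance_m, base_matches=40.0, ref_distance_m=1.0, floor=4):
    """Minimum number of SIFT matches required at a given distance.

    Hypothetical model: the required count shrinks inversely with distance,
    but never drops below `floor`.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    scaled = base_matches * ref_distance_m / distance_m
    return max(floor, int(round(scaled)))


def is_target(num_matches, distance_m):
    """Accept the person if the match count reaches the distance-dependent threshold."""
    return num_matches >= match_threshold(distance_m)


if __name__ == "__main__":
    for d in (1.0, 2.0, 20.0):
        print(d, match_threshold(d))
```

With a fixed threshold, a distant person who produces few matches would be rejected; letting the threshold fall with distance is one way to keep the decision consistent across the range.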
Funding: supported by the China Knowledge Centre for Engineering Sciences and Technology (No. CKCEST-2014-1-2), the Zhejiang Provincial Natural Science Foundation of China (No. LY14F020027), and the National Natural Science Foundation of China (No. 61272304)
Abstract: This paper presents a novel formulation for detecting objects with articulated rigid bodies, particularly engineering vehicles, in high-resolution monitoring images. High-resolution monitoring images contain many pixels, most of which represent the background. Our method first detects object patches in monitoring images using a coarse detection process. In this phase, we build a descriptor based on histograms of oriented gradient, which contain color frequency information. We then use a linear support vector machine to rapidly detect many image patches that may contain object parts, with a low false negative rate and a high false positive rate. In the second phase, we apply a refinement classification to determine the patches that actually contain objects. In this stage, we increase the size of the image patches so that they include the complete object, using models of the object parts. An accelerated and improved salient mask is then used to improve the performance of the dense scale-invariant feature transform descriptor. The detection process returns the absolute position of positive objects in the original images. We have applied our methods to three datasets to demonstrate their effectiveness.
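The coarse-then-refine structure described in the abstract above can be sketched as a two-stage cascade. Everything in this sketch is a placeholder: the feature vectors, weights, thresholds, and the `is_object` callback are hypothetical stand-ins for the paper's HOG-based descriptor, trained linear SVM, and refinement classifier, none of which are specified in the abstract.

```python
def linear_score(weights, bias, features):
    """Decision value of a linear classifier: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def coarse_candidates(patches, weights, bias, threshold=-0.5):
    """Stage 1: keep every patch whose score exceeds a permissive threshold.

    A low threshold trades a high false positive rate for a low false
    negative rate, as the coarse phase in the abstract does.
    """
    return [p for p in patches
            if linear_score(weights, bias, p["features"]) > threshold]


def refine(candidates, is_object):
    """Stage 2: return positions of candidates accepted by a stricter test."""
    return [p["position"] for p in candidates if is_object(p)]


if __name__ == "__main__":
    w, b = [1.0, -1.0], 0.0
    patches = [
        {"position": (10, 20), "features": [0.9, 0.1]},
        {"position": (30, 40), "features": [0.4, 0.6]},
        {"position": (50, 60), "features": [0.1, 0.9]},
    ]
    kept = coarse_candidates(patches, w, b)
    strict = lambda p: linear_score(w, b, p["features"]) > 0.5
    print(refine(kept, strict))
```

Because the cheap first stage discards most background patches, the more expensive refinement only runs on a small candidate set.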