This paper proposes a novel robust watermarking scheme for digital images using local invariant features and Independent Component Analysis (ICA). Most existing watermarking algorithms cannot resist geometric distortions that desynchronize the watermark location. The proposed method is robust to geometric attacks. To resist geometric distortions, we use a local invariant feature of the image, the scale invariant feature transform (SIFT), which is invariant to translation and scaling. The watermark is inserted into circular patches generated by the scale-invariant keypoint extractor. Rotation invariance is achieved using the translation property of the polar-mapped circular patches. The method is a blind watermarking scheme, because detection uses Independent Component Analysis and does not need the original image. Experimental results show that the method is robust against geometric distortion attacks as well as signal-processing attacks.
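The rotation-invariance argument above rests on a standard property: in polar coordinates, rotating a patch about its centre becomes a circular shift along the angle axis. The following Python sketch illustrates that property only. The names `polar_map` and `circular_shift`, the grid sizes, and the toy intensity function are assumptions for illustration and are not part of the proposed watermarking scheme.

```python
import math

def polar_map(f, n_r, n_theta, radius, rotation=0.0):
    """Sample intensity function f(x, y), rotated by `rotation`, on a polar grid.

    Returns n_r rows (one per radius) of n_theta samples (one per angle).
    """
    grid = []
    for i in range(n_r):
        r = radius * (i + 0.5) / n_r
        row = []
        for j in range(n_theta):
            theta = 2.0 * math.pi * j / n_theta
            # A patch rotated by +rotation, read at angle theta, equals the
            # original patch read at angle theta - rotation.
            x = r * math.cos(theta - rotation)
            y = r * math.sin(theta - rotation)
            row.append(f(x, y))
        grid.append(row)
    return grid

def circular_shift(row, k):
    """Shift a list to the right by k positions, wrapping around."""
    k %= len(row)
    return row[-k:] + row[:-k] if k else list(row)
```

Rotating by a whole number of angular steps, say `k` steps of `2 * math.pi / n_theta`, yields the same polar grid as the unrotated patch with every row passed through `circular_shift(row, k)`, up to floating-point error.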
To obtain a large number of correct matches with high accuracy, this article proposes a robust wide-baseline point matching method based on Scott's proximity matrix and the scale invariant feature transform (SIFT). First, the distance between SIFT features is included in the equations of the proximity matrix to measure the similarity between two feature points; then the normalized cross correlation (NCC) used in Scott's method, which has been modified with adaptive scale and orientation,...
Relative radiometric normalization (RRN) minimizes radiometric differences among images caused by inconsistent acquisition conditions rather than by changes in the surface. The scale invariant feature transform (SIFT) can automatically extract control points (CPs) and is commonly used for remote sensing images. However, its results are often inaccurate and sometimes contain incorrect matches caused by a small number of false CP pairs, which lead to a high false alarm rate in matching. This paper presents a modified method that improves SIFT CP matching by applying the sum of absolute differences (SAD) in a different manner, for images from a new generation of optical satellites in near-equatorial orbit and for multi-sensor images. The proposed method improves CP matching and achieves a significantly high rate of correct matches. The data in this study were obtained from RazakSAT, a new near-equatorial satellite system. The proposed method involves six steps: 1) data reduction, 2) applying SIFT to automatically extract CPs, 3) refining CP matching by using the SAD algorithm with an empirical threshold, 4) calculating the intensity values of the true CPs over all image bands, 5) fitting a linear regression model between the intensity values of the CPs located in the reference and sensed image bands, and 6) performing relative radiometric normalization using the regression transformation functions. Two thresholds (50 and 70) were tested experimentally in this study. Following the proposed method, the extracted SIFT CPs were refined from 775, 1125, 883, 804, 883, and 681 false pairs to 342, 424, 547, 706, 547, and 469 corrected and matched pairs, respectively.
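Steps 3 to 6 of the method above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of a patch as a flat list of intensities, and the "keep if SAD is at or below the threshold" rule are assumptions. Only the idea of SAD-based refinement followed by a per-band linear regression comes from the abstract.

```python
def sad(patch_a, patch_b):
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(a - b) for a, b in zip(patch_a, patch_b))

def filter_pairs(pairs, threshold):
    """Keep CP pairs (reference_patch, sensed_patch) with SAD <= threshold."""
    return [p for p in pairs if sad(p[0], p[1]) <= threshold]

def fit_line(xs, ys):
    """Ordinary least squares fit of ys ~ gain * xs + offset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gain = sxy / sxx
    return gain, mean_y - gain * mean_x

def normalize_band(sensed_band, gain, offset):
    """Map sensed-band intensities onto the radiometry of the reference band."""
    return [gain * v + offset for v in sensed_band]
```

In this sketch, `fit_line` would be called once per band with the sensed-image CP intensities as `xs` and the reference-image CP intensities as `ys`, and the resulting gain and offset would then be applied to the whole sensed band.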
On the basis of scale invariant feature transform (SIFT) descriptors, a novel kind of local invariant based on the SIFT sequence scale (SIFT-SS) is proposed and applied to target classification. First, the merits of using a SIFT algorithm for target classification are discussed. Second, the scales of the SIFT descriptors are sorted in descending order to form the SIFT-SS, which is fed to a support vector machine (SVM) with a radial basis function (RBF) kernel to train the classifier used for target classification. Experimental results indicate that the SIFT-SS algorithm is efficient for target classification and obtains a higher recognition rate than affine moment invariants (AMI) and multi-scale auto-convolution (MSA) in some complex situations, such as in the presence of noise and occlusions. Moreover, the computational time of SIFT-SS is shorter than that of MSA and longer than that of AMI.
Person re-identification is a prevalent technology deployed in intelligent surveillance. Remarkable achievements have been made by person re-identification methods that assume all person images have a sufficiently high resolution, yet such models are not applicable to the open world. In the real world, the changing distance between pedestrians and the camera makes the resolution of captured pedestrians inconsistent. When low-resolution (LR) images in the query set are matched with high-resolution (HR) images in the gallery set, the performance of the pedestrian matching task degrades because critical pedestrian information is absent from the LR images. To address these issues, we present a dual-stream coupling network with wavelet transform (DSCWT) for the cross-resolution person re-identification task. First, we use the multi-resolution analysis principle of the wavelet transform to process the low-frequency and high-frequency regions of LR images separately, which restores the lost detail information of LR images. Then, we devise a residual knowledge constrained loss function that transfers knowledge between the two streams of LR images and HR images to obtain pedestrian invariant features at various resolutions. Extensive qualitative and quantitative experiments across four benchmark datasets verify the superiority of the proposed approach.
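The split into low-frequency and high-frequency parts that the abstract relies on can be illustrated with one level of a wavelet decomposition. The sketch below uses the unnormalized Haar wavelet on a 1-D signal purely as an assumption: the abstract does not name the wavelet family, and DSCWT itself operates on 2-D images inside a network.

```python
def haar_1d(signal):
    """One level of an unnormalized Haar transform on an even-length list.

    Returns (low, high): pairwise averages and pairwise half-differences.
    """
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high

def inverse_haar_1d(low, high):
    """Rebuild the original signal from its low- and high-frequency parts."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out
```

The `low` band is a half-resolution approximation of the signal and the `high` band holds the detail lost by that approximation, which is why the two bands can be processed separately and recombined without loss.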
The results of face recognition are often inaccurate due to factors such as illumination, noise intensity, and affine/projective transformation. The scale invariant feature transform (SIFT) has been proposed in response to these problems, but its computational complexity seriously affects the efficiency of the algorithm. To solve this problem, a SIFT algorithm based on principal component analysis (PCA) dimensionality reduction is proposed. The algorithm first uses PCA, which can screen feature points, to filter the feature points extracted in advance by the SIFT algorithm; the high-dimensional data is then projected into a low-dimensional space to remove redundant feature points, thereby changing the way feature descriptors are generated and achieving dimensionality reduction. In experiments on the public ORL face database, the SIFT descriptor is reduced to 20 dimensions, which improves the efficiency of face extraction; several experimental results are compared and analyzed to verify the superiority of the improved algorithm.
Local invariant algorithms applied to downward-looking image registration usually compute the camera's pose relative to visual landmarks. Generally, image registration with these approaches faces three requirements. First, the algorithm is easily influenced by illumination, so it must be robust to illumination changes. Second, the algorithm should have low computational complexity. Third, the depth information of images needs to be estimated without other sensors. This paper investigates a well-known local invariant feature, the speeded up robust feature (SURF), and proposes a high-speed and robust image registration and localization algorithm based on it. Supported by feature tracking and pose estimation methods, the proposed algorithm can compute camera poses under different conditions of scale, viewpoint, and rotation so as to localize an object's position precisely. The study then runs registration experiments with the scale invariant feature transform (SIFT), SURF, and the proposed algorithm, and designs a method to evaluate their performance. Furthermore, the study performs an object retrieval test on remote sensing video. Because remote sensing frames exhibit large deformation, the registration algorithm incorporates the Kanade-Lucas-Tomasi (KLT) 3-D coplanar calibration feature tracker methods, which can localize targets of interest precisely and efficiently. The experimental results show that the proposed method has a higher localization speed and a lower localization error rate than traditional visual simultaneous localization and mapping (vSLAM) over a period of time.
Systems using numerous cameras are emerging in many fields due to their ease of production and reduced cost, and one of the fields where they are expected to be used more actively in the near future is in image-based rendering (IBR). Color correction between views is necessary to use multi-view systems in IBR to make audiences feel comfortable when views are switched or when a free viewpoint video is displayed. Color correction usually involves two steps: the first is to adjust camera parameters such as gain, brightness, and aperture before capture, and the second is to modify captured videos through image processing. This paper deals with the latter, which does not need a color pattern board. The proposed method uses scale invariant feature transform (SIFT) to detect correspondences, treats RGB channels independently, calculates lookup tables with an energy-minimization approach, and corrects captured video with these tables. The experimental results reveal that this approach works well.
To improve the performance of the scale invariant feature transform (SIFT), a modified SIFT (M-SIFT) descriptor is proposed to realize fast and robust key-point extraction and matching. In descriptor generation, 3 rotation-invariant concentric-ring grids around the key-point location are used instead of the 16 square grids used in the original SIFT. Then, 10 orientations are accumulated for each grid, which results in a 30-dimension descriptor. In descriptor matching, a rough mismatch rejection step is proposed based on the difference in grey information between matching points. The performance of the proposed method is tested for image mosaicking on simulated and real-world images. Experimental results show that the M-SIFT descriptor inherits SIFT's invariance to image scale and rotation, illumination change, and affine distortion. In addition, the time cost of feature extraction is reduced by 50% compared with the original SIFT, and the rough mismatch rejection step can reject at least 70% of mismatches. The results also demonstrate that the proposed M-SIFT method is superior to other improved SIFT methods in speed and robustness.
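The descriptor layout described above (3 concentric rings, 10 orientation bins per ring, 3 × 10 = 30 dimensions) can be sketched in Python as follows. This is an illustration under stated assumptions, not the published M-SIFT: the equal-width ring boundaries, the unit-length normalization, the absence of any weighting or interpolation, and the name `ring_descriptor` are all hypothetical choices.

```python
import math

N_RINGS = 3   # concentric rings around the key point (from the abstract)
N_BINS = 10   # orientation bins per ring (from the abstract)

def ring_descriptor(samples, radius, main_orientation=0.0):
    """Build a 30-D descriptor from gradient samples around a key point.

    samples: iterable of (dx, dy, magnitude, orientation), with (dx, dy)
    relative to the key point and orientation in radians.
    """
    desc = [0.0] * (N_RINGS * N_BINS)
    for dx, dy, mag, ori in samples:
        r = math.hypot(dx, dy)
        if r >= radius:
            continue  # sample lies outside the descriptor support
        ring = int(r / radius * N_RINGS)
        # Measure orientation relative to the key point's main orientation
        # so that the histogram does not change when the image rotates.
        rel = (ori - main_orientation) % (2.0 * math.pi)
        bin_idx = int(rel / (2.0 * math.pi) * N_BINS) % N_BINS
        desc[ring * N_BINS + bin_idx] += mag
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc
```

Because ring membership depends only on the distance to the key point, rotating the patch leaves each sample in the same ring, which is the property the concentric-ring grid is meant to provide.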
Image matching based on the scale invariant feature transform (SIFT) is one of the most popular image matching algorithms and exhibits high robustness and accuracy. Grayscale images rather than color images are generally used to compute SIFT descriptors in order to reduce complexity. In this case, regions that have a similar grayscale level but different hues tend to produce wrong matching results. The loss of color information may therefore decrease the matching ratio. An image matching algorithm based on SIFT is proposed that adds a color offset and an exposure offset when converting color images to grayscale images in order to enhance the matching ratio. Experimental results show that the proposed algorithm can effectively differentiate regions with different colors but a similar grayscale level, and increases the matching ratio of SIFT-based image matching. Furthermore, it adds little complexity compared with the traditional SIFT.
This paper aims to provide multi-source remote sensing images registered in geometric space for image fusion. Focusing on the characteristics and differences of multi-source remote sensing images, a feature-based registration algorithm is implemented. The key technologies include an image scale-space for multi-scale properties, Harris corner detection for keypoint extraction, and the partial intensity invariant feature descriptor (PIIFD) for keypoint description. Together these form the proposed multi-scale Harris-PIIFD image registration framework. Experimental results on fifteen sets of representative real data show that the algorithm performs well and stably in multi-source remote sensing image registration and achieves accurate spatial alignment, which gives it strong practical application value and a certain generalization ability.
The Scale Invariant Feature Transform (SIFT) is a widely used computer vision algorithm that detects and extracts local feature descriptors from images. SIFT is computationally intensive, making it infeasible for a single-threaded implementation to extract local feature descriptors from high-resolution images in real time. In this paper, an approach to parallelizing the SIFT algorithm is demonstrated using NVIDIA's Graphics Processing Unit (GPU). The parallelization design for SIFT on GPUs is divided into two stages: a) algorithm design, covering generic design strategies that focus on data, and b) implementation design, covering architecture-specific design strategies that focus on optimally using GPU resources for maximum occupancy. Increasing memory latency hiding, eliminating branches, and data blocking achieve a significant decrease in average computational time. Furthermore, it is observed via Paraver tools that our approach to parallelization, while optimizing for maximum occupancy, allows the GPU to execute the memory-bound SIFT algorithm at optimal levels.
Constructing an appropriate spatial consistency measure is key to improving image retrieval performance. To address this problem, this paper introduces a novel image retrieval mechanism based on family filtration in the object region. First, an object region is supplied by selecting a rectangle in a query image, and the system returns a ranked list of images that contain the same object, retrieved from a corpus of 100 images, as the result of the first ranking. To further improve retrieval performance, we add an efficient spatial consistency stage, named family-based spatial consistency filtration, to re-rank the results returned by the first ranking. We evaluate the retrieval system through experiments on a dataset selected from the key frames of "TREC Video Retrieval Evaluation 2005 (TRECVID2005)". The experimental results show that the proposed retrieval mechanism has a major effect on retrieval quality. The paper also verifies the stability of the retrieval mechanism by increasing the number of images from 100 to 2000 and realizes generalized retrieval with objects outside the dataset.
An Unmanned Aircraft System (UAS) is an aircraft or ground station that can either be remote-controlled manually or fly autonomously under the guidance of pre-programmed Global Positioning System (GPS) waypoint flight plans or more complex onboard intelligent systems. UAS aircraft have recently found extensive applications in military reconnaissance and surveillance, homeland security, precision agriculture, fire monitoring and analysis, and other kinds of aid needed in disasters. UAS missions can be conducted through surveillance videos captured by a UAS digital imaging payload over the areas of interest. In this paper, the authors present an effective method to detect and extract architectural buildings in rural environments from UAS video sequences. SIFT points are chosen as image features. A planar homography is adopted as the motion model between different image frames. The proposed algorithm is tested on real UAS video data.
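A planar homography, the motion model named above, maps a point in one frame to the next through a 3×3 matrix followed by a perspective division. The Python sketch below shows that mapping and the residual of a matched point pair under a given homography. The function names and the example matrix are assumptions, and the abstract does not state how such residuals are used in the building detection step.

```python
import math

def apply_homography(H, point):
    """Map an (x, y) point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)  # perspective division

def reprojection_error(H, src, dst):
    """Distance between the mapped source point and its matched point."""
    px, py = apply_homography(H, src)
    return math.hypot(px - dst[0], py - dst[1])
```

For example, with `H = [[1, 0, 5], [0, 1, -2], [0, 0, 1]]` (a pure translation), the point `(3, 4)` maps to `(8.0, 2.0)`, so a SIFT match from `(3, 4)` to `(8, 2)` has zero residual under this model.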
The 3D object visual tracking problem is studied for the robot vision system of the 220kV/330kV high-voltage live-line insulator cleaning robot. Visual tracking of 3D objects based on the SUSAN Edge based Scale Invariant Feature (SESIF) algorithm is achieved in three stages: the first-frame stage, the tracking stage, and the recovering stage. An SESIF-based object recognition algorithm is proposed to find the initial location at both the first-frame stage and the recovering stage. An SESIF and Lie group based visual tracking algorithm is used to track the 3D object. Experiments verify the algorithm's robustness. This algorithm will be used in the second generation of the 220kV/330kV high-voltage live-line insulator cleaning robot.
To solve the problem of wide-baseline stereo image matching with multiple cameras, the paper puts forward an image matching method that combines maximally stable extremal regions (MSER) with the Scale Invariant Feature Transform (SIFT). It uses MSER instead of the difference of Gaussians to detect feature regions. After being fitted to elliptical regions, those regions are normalized into unit circles and represented with SIFT descriptors. After initial matching of the feature points, the method estimates the fundamental matrix and removes outliers by auto-maximum a posteriori sample consensus. The experimental results indicate that the method is robust to viewpoint changes, effectively reduces computational complexity, and improves matching accuracy.
Recent years have witnessed the great success of self-supervised learning (SSL) in recommendation systems. However, SSL recommender models are likely to suffer from spurious correlations, leading to poor generalization. To mitigate spurious correlations, existing work usually pursues ID-based SSL recommendation or uses feature engineering to identify spurious features. Nevertheless, ID-based SSL approaches sacrifice the positive impact of invariant features, while feature engineering methods require high-cost human labeling. To address these problems, we aim to mitigate the effect of spurious correlations automatically. This objective requires 1) automatically masking spurious features without supervision, and 2) blocking the transmission of negative effects from spurious features to other features during SSL. To handle the two challenges, we propose an invariant feature learning framework, which first divides user-item interactions into multiple environments with distribution shifts and then learns a feature mask mechanism to capture invariant features across environments. Based on the mask mechanism, we can remove the spurious features for robust predictions and block the negative effect transmission via mask-guided feature augmentation. Extensive experiments on two datasets demonstrate the effectiveness of the proposed framework in mitigating spurious correlations and improving the generalization abilities of SSL models.
Gabor filters are generally regarded as the most bionic filters with respect to human visual perception. Their filtered coefficients are thus widely used to represent the texture information of irises. However, these wavelet-based iris representations are inevitably misaligned in the iris matching stage. In this paper, we try to improve the bionic Gabor representation of each iris by combining local Gabor features with the key-point descriptors of the Scale Invariant Feature Transform (SIFT), which respectively simulate the process of visual object class recognition in the frequency and spatial domains. A localized approach to Gabor features is used to avoid the blocking effect in the process of image division, while a SIFT key-point selection strategy is provided to remove noise and probable misaligned key points. To combine these iris features, we propose a fusion rule based on support vector regression, which fuses their matching scores into a scalar score for the classification decision. Experiments on three public and self-developed iris datasets validate the discriminative ability of our multiple bionic iris features and demonstrate that the fusion system outperforms some state-of-the-art methods.
This paper focuses mainly on semi-strapdown image homing guided (SSIHG) system design based on optical flow for a six-degree-of-freedom (6-DOF) axial-symmetric skid-to-turn missile. Three optical flow algorithms suitable for large displacements are introduced and compared. The influence of different displacements on computational accuracy of the three algorithms is analyzed statistically. The total optical flow of the SSIHG missile is obtained using the Scale Invariant Feature Transform (SIFT) algorithm, which is the best among the three for large displacements. After removing the rotational optical flow caused by rotation of the gimbal and missile body from the total optical flow, the remaining translational optical flow is smoothed via Kalman filtering. The circular navigation guidance (CNG) law with impact angle constraint is then obtained utilizing the smoothed translational optical flow and position of the target image. Simulations are carried out under both disturbed and undisturbed conditions, and results indicate the proposed guidance strategy for SSIHG missiles can result in a precise target hit with a desired impact angle without the need for the time-to-go parameter.
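The two flow-processing steps described above, subtracting the rotational component from the total optical flow and smoothing the remainder with a Kalman filter, can be sketched in Python as follows. This is a simplified illustration: the scalar constant-value state model, the default noise variances, and the function names are assumptions, and the paper's actual filter design is not given in the abstract.

```python
def translational_flow(total, rotational):
    """Subtract rotational flow vectors from total flow vectors, per point."""
    return [(tu - ru, tv - rv) for (tu, tv), (ru, rv) in zip(total, rotational)]

def kalman_smooth(measurements, process_var=1e-3, meas_var=0.25):
    """Scalar Kalman filter with a constant-value state model.

    process_var and meas_var are illustrative tuning values, not taken
    from the paper.
    """
    estimate = measurements[0]
    error = 1.0  # initial estimate variance (assumed)
    smoothed = [estimate]
    for z in measurements[1:]:
        error += process_var                 # predict: uncertainty grows
        gain = error / (error + meas_var)    # Kalman gain
        estimate += gain * (z - estimate)    # update with the measurement
        error *= 1.0 - gain                  # posterior variance
        smoothed.append(estimate)
    return smoothed
```

In this sketch each flow component (horizontal and vertical) would be passed through `kalman_smooth` separately as a time series, one value per frame.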
Recent advances in 3D scanning technologies allow us to acquire accurate and dense 3D scan data of large-scale environments efficiently. Currently, there are various methods for acquiring large-scale 3D scan data, such as Mobile Laser Scanning (MLS), Airborne Laser Scanning, Terrestrial Laser Scanning, photogrammetry, and Structure from Motion (SfM). In particular, MLS is useful for acquiring dense point clouds of roads and road-side objects, and SfM is a powerful technique for reconstructing textured meshes from a set of digital images. In this research, a method is proposed for registering point clouds from vehicle-based MLS (MLS point clouds) with textured meshes from the SfM of aerial photographs (SfM meshes), so that high-quality surface models of urban areas can be created by combining them. In general, an SfM mesh carries no scale information; therefore, the scale, position, and orientation of the SfM mesh are adjusted in the registration process. In our method, 2D feature points are first extracted from both the SfM mesh and the MLS point cloud. This process consists of ground- and building-plane extraction by region growing, random sample consensus, and the least squares method; vertical edge extraction by detecting intersections between the planes; and feature point extraction by intersection tests between the ground plane and the edges. Then, the corresponding feature points between the MLS point cloud and the SfM mesh are searched efficiently using similarity invariant features and hashing. Next, a coordinate transformation is applied to the SfM mesh so that the ground planes and corresponding feature points are aligned. Finally, the scaling Iterative Closest Point algorithm is applied for accurate registration. Experimental results for three data sets show that our method is effective for registering the SfM mesh and MLS point cloud of urban areas including buildings.
Funding: Supported by the National Natural Science Foundation of China (60373062, 60573045)
Abstract: This paper proposes a novel robust watermarking scheme for digital images using local invariant features and Independent Component Analysis (ICA). Most existing watermarking algorithms cannot resist geometric distortions that desynchronize the watermark location. The proposed method is robust to geometric attacks. To resist geometric distortions, we use a local invariant feature of the image, the scale invariant feature transform (SIFT), which is invariant to translation and scaling. The watermark is inserted into circular patches generated by the scale-invariant keypoint extractor. Rotation invariance is achieved using the translation property of the polar-mapped circular patches. The method is a blind watermarking scheme, because detection uses Independent Component Analysis and does not need the original image. Experimental results show that the method is robust against geometric distortion attacks as well as signal-processing attacks.
Funding: National High-tech Research and Development Program (2007AA01Z314); National Natural Science Foundation of China (60873085)
Abstract: To obtain a large number of correct matches with high accuracy, this article proposes a robust wide-baseline point matching method based on Scott's proximity matrix and the scale invariant feature transform (SIFT). First, the distance between SIFT features is included in the equations of the proximity matrix to measure the similarity between two feature points; then the normalized cross correlation (NCC) used in Scott's method, which has been modified with adaptive scale and orientation,...
Abstract: Relative radiometric normalization (RRN) minimizes radiometric differences among images caused by inconsistent acquisition conditions rather than by changes in the surface. The scale invariant feature transform (SIFT) can automatically extract control points (CPs) and is commonly used for remote sensing images. However, its results are often inaccurate and sometimes contain incorrect matches caused by a small number of false CP pairs, which lead to a high false alarm rate in matching. This paper presents a modified method that improves SIFT CP matching by applying the sum of absolute differences (SAD) in a different manner, for images from a new generation of optical satellites in near-equatorial orbit and for multi-sensor images. The proposed method improves CP matching and achieves a significantly high rate of correct matches. The data in this study were obtained from RazakSAT, a new near-equatorial satellite system. The proposed method involves six steps: 1) data reduction, 2) applying SIFT to automatically extract CPs, 3) refining CP matching by using the SAD algorithm with an empirical threshold, 4) calculating the intensity values of the true CPs over all image bands, 5) fitting a linear regression model between the intensity values of the CPs located in the reference and sensed image bands, and 6) performing relative radiometric normalization using the regression transformation functions. Two thresholds (50 and 70) were tested experimentally in this study. Following the proposed method, the extracted SIFT CPs were refined from 775, 1125, 883, 804, 883, and 681 false pairs to 342, 424, 547, 706, 547, and 469 corrected and matched pairs, respectively.
Funding: Supported by the National High Technology Research and Development Program (863 Program) (2010AA7080302)
Abstract: On the basis of scale invariant feature transform (SIFT) descriptors, a novel kind of local invariant based on the SIFT sequence scale (SIFT-SS) is proposed and applied to target classification. First, the merits of using a SIFT algorithm for target classification are discussed. Second, the scales of the SIFT descriptors are sorted in descending order to form the SIFT-SS, which is fed to a support vector machine (SVM) with a radial basis function (RBF) kernel to train the classifier used for target classification. Experimental results indicate that the SIFT-SS algorithm is efficient for target classification and obtains a higher recognition rate than affine moment invariants (AMI) and multi-scale auto-convolution (MSA) in some complex situations, such as in the presence of noise and occlusions. Moreover, the computational time of SIFT-SS is shorter than that of MSA and longer than that of AMI.
Funding: Supported by the National Natural Science Foundation of China (61471154, 61876057) and the Key Research and Development Program of Anhui Province, Special Project of Strengthening Science and Technology Police (202004D07020012).
Abstract: Person re-identification is a prevalent technology deployed in intelligent surveillance. Remarkable achievements have been made by person re-identification methods that assume all person images have a sufficiently high resolution, yet such models are not applicable to the open world. In the real world, the changing distance between pedestrians and the camera makes the resolution of captured pedestrians inconsistent. When low-resolution (LR) images in the query set are matched with high-resolution (HR) images in the gallery set, the performance of the pedestrian matching task degrades because critical pedestrian information is absent from the LR images. To address these issues, we present a dual-stream coupling network with wavelet transform (DSCWT) for the cross-resolution person re-identification task. First, we use the multi-resolution analysis principle of the wavelet transform to process the low-frequency and high-frequency regions of LR images separately, which restores the lost detail information of LR images. Then, we devise a residual knowledge constrained loss function that transfers knowledge between the two streams of LR images and HR images to obtain pedestrian invariant features at various resolutions. Extensive qualitative and quantitative experiments across four benchmark datasets verify the superiority of the proposed approach.
Funding: Supported by the National Natural Science Foundation of China (No. 61571222), the Natural Science Research Program of Higher Education Jiangsu Province (No. 19KJD520005), the Qing Lan Project of Jiangsu Province (Su Teacher's Letter 2021 No. 11), and the Jiangsu Graduate Scientific Research Innovation Program (No. KYCX21_1944).
Abstract: The results of face recognition are often inaccurate due to factors such as illumination, noise intensity, and affine/projective transformation. The scale invariant feature transform (SIFT) has been proposed in response to these problems, but its computational complexity seriously affects the efficiency of the algorithm. To solve this problem, a SIFT algorithm based on principal component analysis (PCA) dimensionality reduction is proposed. The algorithm first uses PCA, which can screen feature points, to filter the feature points extracted in advance by the SIFT algorithm; then the high-dimensional data is projected into a low-dimensional space to remove redundant feature points, thereby changing the way feature descriptors are generated and finally achieving dimensionality reduction. In this paper, through experiments on the public ORL face database, the dimension of SIFT is reduced to 20, which improves the efficiency of face extraction; several experimental results are compared and analyzed to verify the superiority of the improved algorithm.
Funding: Supported by the National Natural Science Foundation of China (60802043) and the National Basic Research Program of China (973 Program) (2010CB327900).
Abstract: Local invariant algorithms applied to downward-looking image registration usually compute the camera's pose relative to visual landmarks. Generally, there are three requirements in the process of image registration when using these approaches. First, the algorithm is apt to be influenced by illumination. Second, the algorithm should have low computational complexity. Third, the depth information of images needs to be estimated without other sensors. This paper investigates a well-known local invariant feature, the speeded up robust feature (SURF), and proposes a high-speed and robust image registration and localization algorithm based on it. Supported by feature tracking and pose estimation methods, the proposed algorithm can compute camera poses under different conditions of scale, viewpoint and rotation so as to precisely localize an object's position. Finally, the study conducts registration experiments with the scale invariant feature transform (SIFT), SURF and the proposed algorithm, and designs a method to evaluate their performance. Furthermore, the study carries out an object retrieval test on remote sensing video. Because remote sensing frames are strongly deformed, the registration algorithm incorporates the Kanade-Lucas-Tomasi (KLT) 3-D coplanar calibration feature tracker methods, which can localize targets of interest precisely and efficiently. The experimental results show that the proposed method has a higher localization speed and a lower localization error rate than traditional visual simultaneous localization and mapping (vSLAM) over a period of time.
Abstract: Systems using numerous cameras are emerging in many fields due to their ease of production and reduced cost, and one of the fields where they are expected to be used more actively in the near future is image-based rendering (IBR). Color correction between views is necessary to use multi-view systems in IBR to make audiences feel comfortable when views are switched or when a free viewpoint video is displayed. Color correction usually involves two steps: the first is to adjust camera parameters such as gain, brightness, and aperture before capture, and the second is to modify captured videos through image processing. This paper deals with the latter, which does not need a color pattern board. The proposed method uses the scale invariant feature transform (SIFT) to detect correspondences, treats RGB channels independently, calculates lookup tables with an energy-minimization approach, and corrects captured video with these tables. The experimental results reveal that this approach works well.
Funding: Supported by the National Natural Science Foundation of China (60905012).
Abstract: To improve the performance of the scale invariant feature transform (SIFT), a modified SIFT (M-SIFT) descriptor is proposed to realize fast and robust key-point extraction and matching. In descriptor generation, 3 rotation-invariant concentric-ring grids around the key-point location are used instead of the 16 square grids used in the original SIFT. Then, 10 orientations are accumulated for each grid, which results in a 30-dimension descriptor. In descriptor matching, a rough mismatch rejection step is proposed based on the difference in grey information between matching points. The performance of the proposed method is tested for image mosaicking on simulated and real-world images. Experimental results show that the M-SIFT descriptor inherits SIFT's invariance to image scale and rotation, illumination change and affine distortion. In addition, the time cost of feature extraction is reduced by 50% compared with the original SIFT, and the rough mismatch rejection can reject at least 70% of mismatches. The results also demonstrate that the proposed M-SIFT method is superior to other improved SIFT methods in speed and robustness.
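The 3 rings by 10 orientations layout described in the abstract above can be sketched as a simple histogram accumulation. This is only a hedged illustration: the equal-width ring boundaries, the magnitude weighting, the final unit-length normalization, and the name `ring_descriptor` are assumptions, not details confirmed by the abstract.

```python
import math


def ring_descriptor(samples, radius, rings=3, bins=10):
    """Accumulate gradient samples into a rings x bins orientation histogram.

    samples: iterable of (dx, dy, magnitude, angle), where (dx, dy) is the
    offset from the key-point and angle (radians) is measured relative to
    the key-point's dominant orientation. Equal-width rings are assumed.
    """
    hist = [0.0] * (rings * bins)
    for dx, dy, mag, angle in samples:
        r = math.hypot(dx, dy)
        if r >= radius:
            continue  # sample lies outside the support region
        ring = min(int(r / radius * rings), rings - 1)
        frac = (angle % (2 * math.pi)) / (2 * math.pi)
        b = int(frac * bins) % bins  # guard against frac rounding to 1.0
        hist[ring * bins + b] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm else hist


desc = ring_descriptor(
    [(1.0, 0.0, 2.0, 0.0), (5.0, 0.0, 1.0, math.pi)], radius=6.0
)
```

Because each bin depends only on the distance from the key-point and on the orientation relative to the dominant direction, rotating the patch leaves the 30 values unchanged.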
Funding: Supported by the National Natural Science Foundation of China (61271315) and the State Scholarship Fund of China.
Abstract: Image matching based on the scale invariant feature transform (SIFT) is one of the most popular image matching algorithms and exhibits high robustness and accuracy. Grayscale images rather than color images are generally used to compute SIFT descriptors in order to reduce complexity. In this case, regions that have a similar grayscale level but different hues tend to produce wrong matching results. Therefore, the loss of color information may decrease the matching ratio. An image matching algorithm based on SIFT is proposed, which adds a color offset and an exposure offset when converting color images to grayscale images in order to enhance the matching ratio. Experimental results show that the proposed algorithm can effectively differentiate regions with different colors but a similar grayscale level, and increases the matching ratio of SIFT-based image matching. Furthermore, it does not introduce much more complexity than the traditional SIFT.
Abstract: This paper aims at providing multi-source remote sensing images registered in geometric space for image fusion. Focusing on the characteristics and differences of multi-source remote sensing images, a feature-based registration algorithm is implemented. The key technologies include an image scale-space for implementing multi-scale properties, Harris corner detection for keypoint extraction, and the partial intensity invariant feature descriptor (PIIFD) for keypoint description. Eventually, a multi-scale Harris-PIIFD image registration algorithm framework is proposed. The experimental results on fifteen sets of representative real data show that the algorithm has excellent, stable performance in multi-source remote sensing image registration and can achieve accurate spatial alignment, which has strong practical application value and a certain generalization ability.
Abstract: The Scale Invariant Feature Transform (SIFT) algorithm is a widely used computer vision algorithm that detects and extracts local feature descriptors from images. SIFT is computationally intensive, making it infeasible for a single-threaded implementation to extract local feature descriptors from high-resolution images in real time. In this paper, an approach to parallelizing the SIFT algorithm is demonstrated using NVIDIA's Graphics Processing Unit (GPU). The parallelization design for SIFT on GPUs is divided into two stages: a) algorithm design, with generic design strategies that focus on data, and b) implementation design, with architecture-specific design strategies that focus on optimally using GPU resources for maximum occupancy. Increasing memory latency hiding, eliminating branches and data blocking achieve a significant decrease in average computational time. Furthermore, it is observed via Paraver tools that our approach to parallelization, while optimizing for maximum occupancy, allows the GPU to execute the memory-bound SIFT algorithm at optimal levels.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (No. 2007AA01Z416), the National Natural Science Foundation of China (No. 60773056), the Beijing New Star Project on Science and Technology (No. 2007B071), and the Natural Science Foundation of Liaoning Province of China (No. 20052184).
Abstract: How to construct an appropriate spatial consistency measure is the key to improving image retrieval performance. To address this problem, this paper introduces a novel image retrieval mechanism based on family filtration in the object region. First, we supply an object region by selecting a rectangle in a query image, and the system returns, as the result of the first ranking, a ranked list of images that contain the same object, retrieved from a corpus of 100 images. To further improve retrieval performance, we add an efficient spatial consistency stage, named family-based spatial consistency filtration, to re-rank the results returned by the first ranking. We evaluate the performance of the retrieval system through experiments on a dataset selected from the key frames of "TREC Video Retrieval Evaluation 2005 (TRECVID2005)". The experimental results show that the proposed retrieval mechanism has a major effect on retrieval quality. The paper also verifies the stability of the retrieval mechanism by increasing the number of images from 100 to 2000, and realizes generalized retrieval with objects outside the dataset.
Abstract: An Unmanned Aircraft System (UAS) is an aircraft or ground station that can either be remotely controlled manually or fly autonomously under the guidance of pre-programmed Global Positioning System (GPS) waypoint flight plans or more complex onboard intelligent systems. UAS aircraft have recently found extensive applications in military reconnaissance and surveillance, homeland security, precision agriculture, fire monitoring and analysis, and other kinds of aid needed in disasters. The corresponding UAS missions can be conducted through surveillance videos captured by a UAS digital imaging payload over the areas of interest. In this paper, the authors present an effective method to detect and extract architectural buildings in rural environments from UAS video sequences. SIFT points are chosen as image features. The planar homography is adopted as the motion model between different image frames. The proposed algorithm is tested on real UAS video data.
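For readers unfamiliar with the planar homography used as the inter-frame motion model in the abstract above, the sketch below shows how a 3x3 homography maps a point from one frame to another in homogeneous coordinates. The function name and the example matrix are hypothetical; how the authors estimate the matrix from SIFT matches is not stated in the abstract.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


# Hypothetical example: a pure image-plane translation by (5, -2).
H_shift = [[1, 0, 5], [0, 1, -2], [0, 0, 1]]
moved = apply_homography(H_shift, 3, 4)
```

The division by `w` is what distinguishes a homography from an affine map: it lets the model represent the perspective change of a planar scene between two camera positions.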
Funding: National High Technology Research and Development Program of China (863 Program, No. 2002AA42D110-2).
Abstract: The 3D object visual tracking problem is studied for the robot vision system of the 220kV/330kV high-voltage live-line insulator cleaning robot. 3D object visual tracking based on the SUSAN Edge based Scale Invariant Feature (SESIF) algorithm is achieved in three stages: the first-frame stage, the tracking stage, and the recovering stage. An SESIF-based object recognition algorithm is proposed to find the initial location at both the first-frame stage and the recovering stage. An SESIF and Lie group based visual tracking algorithm is used to track the 3D object. Experiments verify the algorithm's robustness. This algorithm will be used in the second generation of the 220kV/330kV high-voltage live-line insulator cleaning robot.
Funding: Sponsored by the Scientific Research Common Program of Beijing Municipal Commission of Education (Grant No. KM201010772021), the National High Technology Research and Development Program of China (863 Program) (Grant No. 2006AA74105), and the National Natural Science Foundation of China (Grant No. 60803103).
Abstract: To solve the problem of wide-baseline stereo image matching based on multiple cameras, this paper puts forward an image matching method that combines maximally stable extremal regions (MSER) with the Scale Invariant Feature Transform (SIFT). It uses MSER instead of the difference of Gaussians to detect feature regions. After being fitted to elliptical regions, those regions are normalized into unit circles and represented with SIFT descriptors. After the initial matching of feature points, the method estimates the fundamental matrix and removes outliers by auto-maximum a posteriori sample consensus. The experimental results indicate that the method is robust to viewpoint changes, effectively reduces computational complexity, and improves matching accuracy.
Abstract: Recent years have witnessed the great success of self-supervised learning (SSL) in recommendation systems. However, SSL recommender models are likely to suffer from spurious correlations, leading to poor generalization. To mitigate spurious correlations, existing work usually pursues ID-based SSL recommendation or utilizes feature engineering to identify spurious features. Nevertheless, ID-based SSL approaches sacrifice the positive impact of invariant features, while feature engineering methods require high-cost human labeling. To address these problems, we aim to automatically mitigate the effect of spurious correlations. This objective requires 1) automatically masking spurious features without supervision, and 2) blocking the transmission of negative effects from spurious features to other features during SSL. To handle the two challenges, we propose an invariant feature learning framework, which first divides user-item interactions into multiple environments with distribution shifts and then learns a feature mask mechanism to capture invariant features across environments. Based on the mask mechanism, we can remove the spurious features for robust predictions and block the negative effect transmission via mask-guided feature augmentation. Extensive experiments on two datasets demonstrate the effectiveness of the proposed framework in mitigating spurious correlations and improving the generalization abilities of SSL models.
Abstract: Gabor filters are generally regarded as the most bionic filters with respect to human visual perception. Their filtered coefficients are thus widely used to represent the texture information of irises. However, these wavelet-based iris representations are inevitably misaligned in the iris matching stage. In this paper, we try to improve the characteristics of the bionic Gabor representation of each iris by combining local Gabor features with the key-point descriptors of the Scale Invariant Feature Transform (SIFT), which respectively simulate the process of visual object class recognition in the frequency and spatial domains. A localized approach to Gabor features is used to avoid the blocking effect in the process of image division, while a SIFT key-point selection strategy is provided to remove noise and probable misaligned key points. To combine these iris features, we propose a support vector regression based fusion rule, which fuses their matching scores into a scalar score for the classification decision. Experiments on three public and self-developed iris datasets validate the discriminative ability of our multiple bionic iris features, and also demonstrate that the fusion system outperforms some state-of-the-art methods.
Funding: Supported by the Armament Research Fund of China (No. 9020A02010313BQ01).
Abstract: This paper focuses mainly on semi-strapdown image homing guided (SSIHG) system design based on optical flow for a six-degree-of-freedom (6-DOF) axial-symmetric skid-to-turn missile. Three optical flow algorithms suitable for large displacements are introduced and compared. The influence of different displacements on computational accuracy of the three algorithms is analyzed statistically. The total optical flow of the SSIHG missile is obtained using the Scale Invariant Feature Transform (SIFT) algorithm, which is the best among the three for large displacements. After removing the rotational optical flow caused by rotation of the gimbal and missile body from the total optical flow, the remaining translational optical flow is smoothed via Kalman filtering. The circular navigation guidance (CNG) law with impact angle constraint is then obtained utilizing the smoothed translational optical flow and position of the target image. Simulations are carried out under both disturbed and undisturbed conditions, and results indicate the proposed guidance strategy for SSIHG missiles can result in a precise target hit with a desired impact angle without the need for the time-to-go parameter.
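The two-step processing in the abstract above (subtract the rotational flow, then smooth the remainder with a Kalman filter) can be sketched for a single flow component as follows. This is a minimal scalar filter under a random-walk state model; the model, the noise variances `q` and `r`, and both function names are assumptions for illustration, not the paper's filter design.

```python
def translational_flow(total, rotational):
    """Remove the rotation-induced component from the total optical flow."""
    return [t - w for t, w in zip(total, rotational)]


def kalman_smooth(measurements, q=1e-3, r=0.25):
    """Scalar Kalman filter with a random-walk state model.

    q: assumed process noise variance, r: assumed measurement noise variance.
    """
    if not measurements:
        return []
    x, p = float(measurements[0]), 1.0   # initial state and covariance
    smoothed = [x]
    for z in measurements[1:]:
        p += q                # predict: state unchanged, uncertainty grows
        k = p / (p + r)       # Kalman gain
        x += k * (z - x)      # update with the new measurement
        p *= (1.0 - k)
        smoothed.append(x)
    return smoothed


flow = translational_flow([3.0, 5.0, 4.5], [1.0, 2.0, 1.5])
filtered = kalman_smooth(flow)
```

A larger `r` relative to `q` trusts the measurements less and yields a smoother but slower-responding estimate.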
Funding: This work was partially supported by JSPS KAKENHI [grant number 26420073].
Abstract: Recent advances in 3D scanning technologies allow us to acquire accurate and dense 3D scan data of large-scale environments efficiently. Currently, there are various methods for acquiring large-scale 3D scan data, such as Mobile Laser Scanning (MLS), Airborne Laser Scanning, Terrestrial Laser Scanning, photogrammetry and Structure from Motion (SfM). In particular, MLS is useful for acquiring dense point clouds of roads and road-side objects, and SfM is a powerful technique for reconstructing textured meshes from a set of digital images. In this research, a method for registering point clouds from vehicle-based MLS (MLS point cloud) with textured meshes from the SfM of aerial photographs (SfM mesh) is proposed, so that high-quality surface models of urban areas can be created by combining them. In general, an SfM mesh carries no scale information; therefore, the scale, position, and orientation of the SfM mesh are adjusted in the registration process. In our method, first, 2D feature points are extracted from both the SfM mesh and the MLS point cloud. This process consists of ground- and building-plane extraction by region growing, random sample consensus and the least squares method; vertical edge extraction by detecting intersections between the planes; and feature point extraction by intersection tests between the ground plane and the edges. Then, the corresponding feature points between the MLS point cloud and the SfM mesh are searched efficiently, using similarity invariant features and hashing. Next, a coordinate transformation is applied to the SfM mesh so that the ground planes and corresponding feature points are aligned. Finally, the scaling Iterative Closest Point algorithm is applied for accurate registration. Experimental results for three data sets show that our method is effective for the registration of the SfM mesh and MLS point cloud of urban areas including buildings.
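Since the abstract above adjusts the scale, position, and orientation of the SfM mesh from matched 2D feature points, a small sketch of the underlying geometry may help: two point correspondences determine a 2D similarity transform in closed form. This is not the paper's hashing search or its scaling ICP step; the complex-number formulation and the function name are illustrative assumptions.

```python
import cmath


def similarity_from_pairs(p0, p1, q0, q1):
    """Estimate scale, rotation and translation mapping p0->q0 and p1->q1.

    Points are (x, y) tuples. With points written as complex numbers, a 2D
    similarity is q = a * p + b, where |a| is the scale and arg(a) the angle.
    """
    zp0, zp1 = complex(*p0), complex(*p1)
    zq0, zq1 = complex(*q0), complex(*q1)
    if zp0 == zp1:
        raise ValueError("source points must be distinct")
    a = (zq1 - zq0) / (zp1 - zp0)
    b = zq0 - a * zp0
    return abs(a), cmath.phase(a), (b.real, b.imag)


scale, angle, shift = similarity_from_pairs((0, 0), (1, 0), (2, 3), (2, 5))
```

In a full pipeline, an estimate like this would only initialize the alignment, with an iterative method refining it over all points.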