Funding: Science and Technology Research Project of Henan Province (222102240014).
Abstract: Traditional feature-based image stitching techniques often struggle with images that lack distinctive features or suffer from quality degradation. The scarcity of annotated datasets of real-life scenes severely undermines the reliability of supervised learning methods for image stitching. Furthermore, existing deep learning architectures designed for image stitching are often too bulky to be deployed on mobile and peripheral computing devices. To address these challenges, this study proposes a novel unsupervised image stitching method based on the YOLOv8 (You Only Look Once version 8) framework that introduces deep homography networks and attention mechanisms. The methodology is partitioned into three distinct stages. The initial stage combines the attention mechanism with a pooling pyramid model to enhance the detection and recognition of compact objects in images, while the deep homography network module estimates the global homography of the input images across multiple viewpoints. The second stage performs preliminary stitching of the masks generated in the initial stage, followed by weighted computation to eliminate common stitching artifacts. The final stage adaptively reconstructs and carefully refines the initial stitching results. Comprehensive experiments across multiple datasets are carried out to assess the proposed model. The method's Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) improved by 10.6% and 6%, respectively. These experimental results confirm the efficacy and utility of the proposed model.
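The abstract reports PSNR gains without stating how the metric is computed. As a point of reference only, the sketch below uses the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr` and the 8-bit default `max_value=255` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def psnr(reference, test, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two images of equal shape.

    max_value is the largest possible pixel value (255 is assumed here
    for 8-bit images; this is an illustrative default).
    """
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    if reference.shape != test.shape:
        raise ValueError("images must have the same shape")
    # Mean squared error over all pixels and channels.
    mse = np.mean((reference - test) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)
```

A higher value means the stitched result is closer to the reference; a relative improvement such as the one quoted above would be computed on the dB values returned by a function of this kind.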
Funding: This project was supported by the National Natural Science Foundation of China (60275042) and the "Shuguang" Project of the Shanghai Municipal Education Committee.
Abstract: Stereo matching is an important research area in stereo vision, and stereo matching of curved surfaces is especially crucial. A novel correspondence algorithm is presented, and its matching uncertainty is computed robustly for feature points on curved surfaces. Corners are matched using the homography constraint in addition to the epipolar constraint to solve the occlusion problem. The sources of uncertainty are analyzed. A cost function is established and serves as the optimality criterion for computing the matching uncertainty. An adaptive Gaussian weighting scheme is put forward to make the matching results robust to noise, which makes the practical application of corner matching possible. Experimental results on an image pair of a curved surface show that computing the uncertainty robustly restrains the effect of noise on matching precision.
Funding: The Ph.D. Programs Foundation of the Ministry of Education of China (20040248046).
Abstract: Identifying point correspondences across views is an important task. A new feature matching algorithm for weakly calibrated stereo images of curved scenes is proposed, based purely on geometric constraints. After initial correspondences are built via the epipolar constraint, many point-to-point image mappings, called homographies, are set up to predict the matching positions of feature points. To refine the predictions and reject false correspondences, four schemes are proposed. Extensive experiments on simulated data as well as on real images of scenes with varying depths show that the proposed method is effective and robust.
Abstract: To improve user satisfaction with augmented reality (AR) technology and the accuracy of the service, it is important to obtain the exact position of the user. A frequently used technique for finding outdoor locations is the global positioning system (GPS), which is less accurate indoors. Therefore, an indoor position is measured by comparing the received signal level of wireless fidelity (Wi-Fi) access points (APs) or by using Bluetooth low energy (BLE) tags. However, Wi-Fi and Bluetooth require additional hardware installation. In this paper, the proposed method estimates the user's position from an indoor image and an indoor coordinate map, without additional hardware installation. The indoor image has several feature points extracted from fixed objects. By matching these feature points with the feature points of the user image, we obtain six or more pixel coordinates from the user image and solve the perspective projection formula for the position of the user on the indoor map. The experimental results show that the user position can be obtained more accurately in the indoor environment using software alone, without additional hardware installation.
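The requirement of six or more points matches the classical direct linear transform (DLT) for a 3x4 perspective projection matrix: the matrix has 11 degrees of freedom and each 3-D to 2-D correspondence contributes two linear equations. The sketch below illustrates that textbook technique with NumPy. It is an assumption that the paper solves the projection this way; the function names `projection_matrix_dlt` and `camera_centre` are hypothetical, and treating the camera centre as the user position is likewise an illustrative assumption.

```python
import numpy as np


def projection_matrix_dlt(world, pixels):
    """Estimate a 3x4 projection matrix P (up to scale) from >= 6 pairs.

    world:  iterable of (X, Y, Z) map coordinates.
    pixels: iterable of (u, v) image coordinates of the same points.
    """
    rows = []
    for (X, Y, Z), (u, v) in zip(world, pixels):
        Xh = [X, Y, Z, 1.0]
        # u = (p1 . Xh) / (p3 . Xh)  ->  p1 . Xh - u * (p3 . Xh) = 0
        rows.append(Xh + [0.0] * 4 + [-u * c for c in Xh])
        # v = (p2 . Xh) / (p3 . Xh)  ->  p2 . Xh - v * (p3 . Xh) = 0
        rows.append([0.0] * 4 + Xh + [-v * c for c in Xh])
    # The least-squares solution is the right singular vector that
    # belongs to the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 4)


def camera_centre(P):
    """Camera centre C in map coordinates, i.e. the null vector of P."""
    _, _, vt = np.linalg.svd(P)
    C = vt[-1]
    return C[:3] / C[3]
```

The points used must not all lie on one plane, otherwise the linear system becomes degenerate. With noisy matches, normalizing the coordinates before solving and refining the result by minimizing reprojection error are common additions that this sketch omits.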
Funding: Supported in part by the National Key R&D Program of China (2018AAA0102200) and the National Natural Science Foundation of China (62002375, 62002376, 62325221, 62132021).
Abstract: Template matching is a fundamental task in computer vision and has been studied for decades. It plays an essential role in the manufacturing industry for estimating the poses of different parts, facilitating downstream tasks such as robotic grasping. Existing methods fail when the template and source images have different modalities, cluttered backgrounds, or weak textures. They also rarely consider geometric transformations via homographies, which commonly exist even for planar industrial parts. To tackle these challenges, we propose an accurate template matching method based on differentiable coarse-to-fine correspondence refinement. We use an edge-aware module to overcome the domain gap between the mask template and the grayscale image, allowing robust matching. An initial warp is estimated using coarse correspondences based on novel structure-aware information provided by transformers. This initial alignment is passed to a refinement network that uses the reference and aligned images to obtain sub-pixel-level correspondences, which give the final geometric transformation. Extensive evaluation shows that our method is significantly better than state-of-the-art methods and baselines, providing good generalization ability and visually plausible results even on unseen real data.
Funding: The National Natural Science Foundation of China (No. 61976091).
Abstract: An oral endoscope image stitching algorithm is studied to obtain wide-field oral images through registration and stitching, which is of great significance for auxiliary diagnosis. Compared with natural images, oral images have weaker textures and fewer features. Traditional feature-based image stitching methods, however, rely heavily on feature extraction quality and often perform unsatisfactorily when stitching images with few features. Moreover, because the images are captured by hand, there are large depth and perspective disparities between them, which also pose a challenge to image stitching. To overcome these problems, we propose an unsupervised oral endoscope image stitching algorithm based on the extraction of overlapping regions and a deep-feature loss. In the registration stage, we extract the overlapping region of the input images by sketching the polygon intersection to screen feature points, and estimate the homography from coarse to fine on a three-layer feature pyramid structure. Moreover, we calculate the loss using deep features instead of pixel values to emphasize the importance of depth disparities in homography estimation. Finally, we reconstruct the stitched images from feature to pixel, which can eliminate artifacts caused by large parallax. Our method is compared with both feature-based and previous deep learning-based methods on the UDIS-D dataset and on our oral endoscopy image dataset. The experimental results show that our algorithm achieves higher homography estimation accuracy and better visual quality, and can be effectively applied to oral endoscope image stitching.
Abstract: Plane metrology from a single uncalibrated image is studied in this paper, and three novel approaches are proposed. The first approach, the key-line-based method, improves on the widely used key-point-based method by using line correspondences directly to compute the homography between the world plane and its image, thereby increasing computational accuracy. The second and third approaches are both based on a pair of vanishing points from two orthogonal sets of parallel lines in the space plane, together with two non-parallel reference distances, but they deal with the problem in different ways. One takes the algebraic viewpoint: it first maps the image points to an affine space via a transformation constructed from the vanishing points, and then computes the metric distance according to the relationship between the affine space and the Euclidean space. The other takes the geometrical viewpoint and is based on the invariance of cross ratios. The second and third methods avoid the selection of control points and are widely applicable. In addition, the paper briefly describes how to retrieve other geometrical entities on the space plane, such as the distance from a point to a line or the angle formed by two lines. Extensive experiments on simulated data as well as on real images show that the first and second approaches are more precise and more robust than the key-point-based approach and the third approach, since these two approaches are fundamentally based on line information.
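The geometrical approach relies on the fact that the cross ratio of four collinear points is preserved by any projective mapping of the plane. The sketch below only demonstrates that invariance numerically; it does not reproduce the paper's measurement procedure. The function names and the example homography are assumptions chosen for illustration.

```python
import numpy as np


def apply_homography(H, pts):
    """Map an (N, 2) array of points through the 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def cross_ratio(a, b, c, d):
    """Cross ratio (AC * BD) / (BC * AD) of four collinear 2-D points.

    Signed positions along the common line are used, so the order of
    the points matters.
    """
    a, b, c, d = (np.asarray(p, dtype=float) for p in (a, b, c, d))
    direction = (d - a) / np.linalg.norm(d - a)
    ta, tb, tc, td = (np.dot(p - a, direction) for p in (a, b, c, d))
    return ((tc - ta) * (td - tb)) / ((tc - tb) * (td - ta))


# Four collinear points on the world plane and an arbitrary example
# homography standing in for the unknown camera projection.
world_pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (4.0, 4.0)]
H_example = np.array([[1.0, 0.2, 3.0],
                      [0.1, 0.9, -2.0],
                      [0.001, 0.002, 1.0]])
image_pts = apply_homography(H_example, world_pts)

cr_world = cross_ratio(*world_pts)
cr_image = cross_ratio(*image_pts)
```

Here `cr_world` and `cr_image` agree up to floating-point error, which is the property that lets a known reference distance along a line be transferred to an unknown one without estimating the full homography.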
Funding: A preliminary version of this paper appeared in Proc. Pacific Graphics 2005, Macao. This project is funded by the National Key Basic Research 973 Program of China (Grant No. 2002CB312100), the National Natural Science Foundation of China (Grant No. 60533080) and the Program for New Century Excellent Talents in University of MOE.
Abstract: In the traditional manifold mosaic, a single center strip is clipped from each source image to create a large image. The displacement between neighboring views must therefore be very small for the strips to be cut effectively. In this paper, a method is proposed to create a manifold mosaic from images with relatively large displacement by cutting out multiple strips in the overlap area according to the homography between images. These strips are then warped together to create a smooth mosaic. An improved RANSAC algorithm is also presented to improve the precision of homography calculation. Experimental results demonstrate the efficiency of the method.
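The abstract does not describe how its RANSAC variant is improved, so the sketch below shows only the plain baseline that such a variant would build on: a four-point DLT fit inside a standard RANSAC loop, followed by a refit on all inliers. The function names, the 2-pixel threshold, and the iteration count are assumptions for illustration.

```python
import numpy as np


def fit_homography(src, dst):
    """DLT estimate of H (up to scale) with dst ~ H @ src, from >= 4 pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)


def transfer_error(H, src, dst):
    """Euclidean distance between H-mapped src points and dst points."""
    proj = np.hstack([src, np.ones((len(src), 1))]) @ H.T
    proj = proj[:, :2] / proj[:, 2:3]
    return np.linalg.norm(proj - dst, axis=1)


def ransac_homography(src, dst, threshold=2.0, iterations=500, seed=0):
    """Plain RANSAC: returns (H, boolean inlier mask)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iterations):
        # Minimal sample: four correspondences define a homography.
        sample = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[sample], dst[sample])
        # Degenerate samples may produce inf/nan errors; those simply
        # fail the threshold test and count as outliers.
        with np.errstate(divide="ignore", invalid="ignore"):
            inliers = transfer_error(H, src, dst) < threshold
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("not enough inliers to fit a homography")
    # Final least-squares refit on the consensus set.
    return fit_homography(src[best], dst[best]), best
```

The returned matrix is defined only up to scale; dividing by its bottom-right entry gives the usual normalized form when that entry is non-zero.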
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 90820305 and 60775040) and the National High-Tech Research and Development (863) Program of China (No. 2012AA041402).
Abstract: In this paper, we propose a new algorithm to establish the data association between a camera and a 2-D Light Detection And Ranging (LIDAR) sensor. In contrast to previous works, where data association is established by calibrating the intrinsic parameters of the camera and the extrinsic parameters of the camera and the LIDAR, we formulate the map between laser points and pixels as a 2-D homography. Line-point correspondences are employed to construct a geometric constraint on the homography matrix. As a result, a checkerboard is not essential, and any object with a straight boundary can serve as an effective target. The calculation of the 2-D homography matrix consists of a linear least-squares solution of a homogeneous system followed by a nonlinear minimization of the geometric error in the image plane. Since the measurement quality affects the accuracy of the result, we investigate the equivalent constraint and show that placing the calibration target near the 2-D LIDAR provides sufficient constraints to calculate the 2-D homography matrix. Simulation and experimental results validate that the proposed algorithm is robust and accurate. Compared with previous works, which require two calibration processes and special calibration targets such as a checkerboard, our method is more flexible and easier to perform.
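One way to read the line-point constraint is that a laser point p, written in homogeneous scan-plane coordinates, must map onto the image line l on which it was observed, so that l^T H p = 0. Each correspondence then gives one linear equation in the nine entries of H, and eight or more in general position determine H up to scale. The sketch below covers only this linear step under that assumed formulation; the nonlinear refinement mentioned in the abstract is omitted, and the function name `homography_from_line_point` is hypothetical.

```python
import numpy as np


def homography_from_line_point(lines, points):
    """Linear estimate of H (up to scale) from line-point pairs.

    lines:  (N, 3) image lines (a, b, c) with a*u + b*v + c = 0.
    points: (N, 2) laser points (x, y) in the LIDAR scan plane.
    Assumes the constraint l^T H p = 0 for each pair, with N >= 8.
    """
    lines = np.asarray(lines, dtype=float)
    points = np.asarray(points, dtype=float)
    if len(lines) < 8:
        raise ValueError("at least 8 line-point pairs are needed")
    homog = np.hstack([points, np.ones((len(points), 1))])
    # Row n holds l_i * p_j in row-major order, so that
    # row . vec(H) = sum_ij l_i * H_ij * p_j = l^T H p.
    A = np.einsum("ni,nj->nij", lines, homog).reshape(len(lines), 9)
    # Homogeneous least squares: smallest right singular vector.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 3)
```

Because a line constrains a point in only one direction, the observed lines need varied orientations and positions for the system to be well conditioned.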
Funding: Supported by the "Eleventh Five" Obligatory Budget of PLA (Grant No. 513150801).
Abstract: This article presents a passive navigation method based on terrain contour matching, in which the 3-D terrain is reconstructed from the image sequence acquired by the onboard camera. To process the image sequence automatically and simultaneously with navigation, a correspondence registration method based on control point tracking is proposed, which tracks sparse control points through the whole image sequence and uses them as correspondences in the relative geometry solution. In addition, a key frame selection method based on the image overlap ratio and intersection angles is explored, from which the requirements for the camera system configuration are derived. The proposed method also includes an optimal local homography estimation algorithm based on the control points, which helps correctly predict the points to be matched and their corresponding speeds. The real-time 3-D terrain reconstructed along the trajectory is then matched with the reference terrain map, and the result provides navigation information. A digital simulation experiment and a real-image-based experiment have verified the proposed method.
Funding: Funded by the Controlcrop Project, P10-TEP-6174, project framework, supported by the Andalusian Ministry of Economy, Innovation and Science (Andalusia, Spain); by the Spanish Ministry of Science and Innovation as well as the EU ERDF funds under grant DPI2014-56364-C2-1-R; by the TEAP project included in the Marie Curie Actions (PIRSES-GA-2013-612659); and by the Young Scientists Fund of the National Natural Science Foundation of China (31401683).
Abstract: Throughout the supply chain, product quality, or changes in it, can generate significant losses. Quality control at some steps is performed manually, which entails a high level of subjectivity; controlling quality and its evolution with automatic systems could reduce these losses. The main objective of this study is to test automatic image analysis techniques on tomatoes and zucchini. Two steps in the supply chain are considered: the feeding of raw products into the handling chain (because low quality reduces the productivity of the chain) and the cool storage of processed products (because their market value is reduced). It was proposed to analyze incoming products at the head of the processing line using CCD cameras to detect low-quality and/or dirty products (the corresponding farmers or suppliers should then be asked to improve in order to maintain the productivity of the line). The second stage analyzes the evolution of the products along the cool chain (storage and transport), where an app developed for Android was proposed as a substitute for the "visual" evaluation used in practice. The algorithms used, which include pre-treatment, segmentation, analysis and presentation of the results, take into account the short time available and the limited battery capacity. High-performance techniques were applied to the homography stage to discard some of the images, resulting in better performance. In addition, threads and RenderScript kernels were created to parallelize the methods applied to the remaining images, allowing faster inspection of the products. The proposed method achieves success rates comparable to, or better than, expert inspection.
Funding: The Key Research Projects of the Foundation Strengthening Program under Grant No. 2020JCJQZD01412 and the National Natural Science Foundation of China under Grant No. 61832016.
Abstract: This paper presents VideoInNet, a novel deep neural network for designated point tracking (DPT) in a monocular RGB video. More concretely, the aim is to track four designated points correlated by a local homography on a textureless planar region in the scene. DPT can be applied to augmented reality and video editing, especially in the field of video advertising. Existing methods predict the locations of the four designated points without appropriately considering the correlation between the points. To solve this problem, VideoInNet predicts the motion of the four designated points correlated by a local homography within a heatmap prediction framework. Our network refines the heatmaps of the designated points in two stages. In the first stage, we introduce a context-aware and location-aware structure to learn a local homography for the designated plane in a supervised way. In the second stage, we introduce an iterative heatmap refinement module to improve the tracking accuracy. We propose a dataset focusing on textureless planar regions, named ScanDPT, for training and evaluation. We show that the error rate of VideoInNet is about 29% lower than that of the state-of-the-art approach when tested on the first 120 frames of the test videos in ScanDPT.