Funding: Sponsored by the Ministry of Education "985" Project Phase II Construction Fund (3040012040101).
Abstract: A new stereo matching scheme for image pairs based on graph cuts is presented, which addresses the large color differences that arise when fusing graph-cut matching results from different color spaces. The scheme builds a normalized histogram and a reference histogram from the matching results, and processes the two histograms with a clustering algorithm. A region-histogram statistical method is adopted to retrieve depth data and produce the final matching results. A standard stereo matching benchmark is used to verify the scheme, and the experiments reported in this paper support the method's suitability for automatic image processing. The scheme dispenses with the manual selection of an adaptive color space and obtains stable matching results. The whole procedure can be executed automatically, improving the integration level of the image analysis process.
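The histogram-comparison idea can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the depth values, the bin count, and the use of an L1 distance between a result's normalized histogram and a reference histogram are all assumptions made for the sketch.

```python
import numpy as np

def normalized_histogram(depths, bins=16, value_range=(0, 255)):
    """Build a histogram of depth values normalized to sum to 1."""
    hist, _ = np.histogram(depths, bins=bins, range=value_range)
    return hist / max(hist.sum(), 1)

def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 = identical)."""
    return float(np.abs(h1 - h2).sum())

# Two matching results that agree closely, and one that diverges.
result_a = np.array([10, 12, 11, 200, 198, 205])
result_b = np.array([11, 13, 10, 199, 201, 204])
result_c = np.array([90, 95, 100, 105, 110, 115])

ha, hb, hc = (normalized_histogram(r) for r in (result_a, result_b, result_c))
assert histogram_distance(ha, hb) < histogram_distance(ha, hc)
```

A fused result whose histogram stays close to the reference would be kept; a diverging one signals a color-space mismatch.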
Funding: Supported by the National Natural Science Foundation of China (62276192).
Abstract: Feature matching plays a key role in computer vision. However, due to the limitations of descriptors, putative matches are inevitably contaminated by massive outliers. This paper tackles the outlier filtering problem from two aspects. First, a robust and efficient graph interaction model is proposed, under the assumption that matches are correlated with each other rather than independently distributed. To this end, we construct a graph based on the local relationships of matches and formulate the outlier filtering task as a binary labeling energy minimization problem, where the pairwise term encodes the interaction between matches. We further show that this formulation can be solved globally by the graph cut algorithm. The new formulation consistently improves the performance of the previous locality-based method without noticeable deterioration in processing time, adding only a few milliseconds. Second, to construct a better graph structure, a robust and geometrically meaningful topology-aware relationship is developed to capture the topological relationship between matches. Together, the two components lead to topology interaction matching (TIM), an effective and efficient method for outlier filtering. Extensive experiments on several large and diverse datasets for multiple vision tasks, including general feature matching, relative pose estimation, homography and fundamental matrix estimation, loop-closure detection, and multi-modal image matching, demonstrate that TIM is more competitive than current state-of-the-art methods in terms of generality, efficiency, and effectiveness. The source code is publicly available at http://github.com/YifanLu2000/TIM.
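The binary-labeling energy minimization described above can be illustrated with a toy s-t min cut. This is a minimal sketch, not TIM's actual energy: the three "matches", their unary outlier costs, and the Potts pairwise weight `lam` are invented for demonstration, and the max-flow routine is a plain Edmonds-Karp implementation.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow; returns the flow value and residual capacities."""
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow, cap
        # Find the bottleneck and push flow along the path.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v].setdefault(u, 0)
            cap[v][u] += bottleneck
        flow += bottleneck

def binary_graph_cut(unary, edges, lam):
    """Minimize sum_i D_i(x_i) + lam * sum_(i,j) [x_i != x_j] via s-t min cut.
    unary maps node -> (cost of label 0, cost of label 1)."""
    cap = {'s': {}, 't': {}}
    for i, (c0, c1) in unary.items():
        cap.setdefault(i, {})
        cap['s'][i] = c1          # cutting s->i pays the label-1 cost
        cap[i]['t'] = c0          # cutting i->t pays the label-0 cost
    for i, j in edges:            # Potts smoothness term, both directions
        cap[i][j] = cap[i].get(j, 0) + lam
        cap[j][i] = cap[j].get(i, 0) + lam
    value, residual = max_flow(cap, 's', 't')
    # Nodes still reachable from s in the residual graph get label 0.
    reach, q = {'s'}, deque(['s'])
    while q:
        u = q.popleft()
        for v, c in residual[u].items():
            if c > 0 and v not in reach:
                reach.add(v)
                q.append(v)
    return value, {i: 0 if i in reach else 1 for i in unary}

# Match 'b' looks like an outlier on its own, but its neighbors pull it back:
value, labels = binary_graph_cut({'a': (0, 4), 'b': (3, 1), 'c': (0, 4)},
                                 [('a', 'b'), ('b', 'c')], lam=2)
assert value == 3 and labels == {'a': 0, 'b': 0, 'c': 0}
```

The pairwise term makes the labeling of correlated matches agree, which is exactly why the graph interaction model outperforms treating matches independently.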
Funding: Supported by the National Natural Science Foundation of China (No. 60472028) and the Specialized Research Fund for the Doctoral Program of Higher Education of MOE, China (No. 20040003015).
Abstract: Skin segmentation is widely used in many computer vision tasks to improve automated visualization. This paper presents a graph cuts algorithm to segment arbitrary skin regions from images. The detected face is used to determine the foreground skin seeds and the background non-skin seeds, with the color probability distribution for the foreground represented by a single Gaussian model and that for the background by a Gaussian mixture model. The probability distribution of the image is used for noise suppression to alleviate the influence of background regions having skin-like colors. Finally, the skin is segmented by graph cuts, with the regional parameter γ optimally selected to adapt to different images. Tests of the algorithm on many real-world photographs show that the scheme accurately segments skin regions and is robust against illumination variations, individual skin variations, and cluttered backgrounds.
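A minimal sketch of the single-Gaussian foreground skin model follows; the mean and covariance in (Cb, Cr) chrominance coordinates, and the two test pixels, are made-up illustrative values, not the paper's learned parameters.

```python
import numpy as np

def gaussian_likelihood(x, mean, cov):
    """Density of a 2-D Gaussian, as used for the foreground skin model."""
    d = x - mean
    inv = np.linalg.inv(cov)
    norm = 2 * np.pi * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * d @ inv @ d) / norm)

# Illustrative skin model in (Cb, Cr) chrominance coordinates.
skin_mean = np.array([110.0, 155.0])
skin_cov = np.array([[80.0, 0.0], [0.0, 60.0]])

skin_pixel = np.array([112.0, 152.0])   # near the skin cluster
sky_pixel = np.array([160.0, 100.0])    # far from it

assert gaussian_likelihood(skin_pixel, skin_mean, skin_cov) > \
       gaussian_likelihood(sky_pixel, skin_mean, skin_cov)
```

In the full algorithm this likelihood feeds the regional term of the graph cut energy, with a GMM playing the analogous role for the background.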
Funding: Supported by the Centre for Integrated Petroleum Research (CIPR), University of Bergen, Norway, and Singapore MOE Grant T207B2202 and NRF2007IDM-IDM002-010.
Abstract: Segmentation of three-dimensional (3D) complicated structures is of great importance for many real applications. In this work we combine the graph cut minimization method with a variant of the level set idea for 3D segmentation based on the Mumford-Shah model. Compared with the traditional approach of solving the Euler-Lagrange equation, we do not need to solve any partial differential equations; instead, the minimum cut on a specially designed graph needs to be computed. The method is tested on data with complicated structures. It is rather stable with respect to the initial value, and the algorithm is nearly parameter free. Experiments show that it can solve large problems much faster than traditional approaches.
Funding: This work was funded by the Deanship of Scientific Research at Jouf University under grant No. DSR-2021-02-0398.
Abstract: Image dehazing is still an open research topic that has been undergoing a lot of development, especially with the renewed interest in machine learning-based methods. A major challenge of the existing dehazing methods is the estimation of transmittance, which is the key element of haze-affected imaging models. Conventional methods are based on a set of assumptions that reduce the solution search space. However, the multiplication of these assumptions tends to restrict the solutions to particular cases that cannot account for the reality of the observed image. In this paper we reduce the number of simplifying hypotheses in order to attain a more plausible and realistic solution by exploiting a priori knowledge of the ground truth. The proposed method relies on pixel information between the ground truth and the hazy image to reduce these assumptions. This is achieved by using the ground truth and the hazy image to find geometric-pixel information through a guided Convolutional Neural Network (CNN) with a Parallax Attention Mechanism (PAM). It uses the differential pixel-based variance to estimate transmittance: the pixel variance uses local and global patches between the assumed ground truth and the hazy image to refine the transmission map. The transmission map is further improved with improved Markov random field (MRF) energy functions. We used different images to test the proposed algorithm. The entropy values of the proposed method were 7.43 and 7.39, a percentage increase of 4.35% and 5.42%, respectively, compared to the best existing results. The increment is similar in other performance quality metrics, which validates its superiority over other existing methods in terms of key image quality evaluation metrics. The proposed approach's drawback, an over-reliance on real ground truth images, is also investigated. The proposed method shows more details and hence yields better images than those from the existing state-of-the-art methods.
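The haze-affected imaging model behind the transmittance estimation is commonly the atmospheric scattering equation I = J·t + A·(1 − t). The sketch below simply inverts it for a known transmission map; the pixel values are illustrative, and this is not the proposed CNN/PAM estimator, only the model it targets.

```python
import numpy as np

def dehaze(I, t, A, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t) for the
    scene radiance J, clamping t to avoid division blow-up in dense haze."""
    t = np.maximum(t, t_min)
    return (I - A * (1 - t)) / t

# Synthesize a hazy signal from a known scene, then recover it exactly.
J_true = np.array([0.2, 0.5, 0.8])   # scene radiance
t = np.array([0.9, 0.6, 0.3])        # per-pixel transmission
A = 1.0                              # atmospheric light
I_hazy = J_true * t + A * (1 - t)

J_rec = dehaze(I_hazy, t, A)
assert np.allclose(J_rec, J_true)
```

In practice t is unknown; the whole difficulty the paper addresses is estimating this transmission map from the image itself.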
Abstract: Context-aware facial recognition regards the recognition of faces in association with their respective environments. This concept is useful for a domestic robot that interacts with humans while performing specific functions in indoor environments. Deep learning models have been relevant in solving facial and place recognition challenges; however, they require the procurement of training images for optimal performance. Pre-trained models have also been offered to reduce training time significantly. Regardless, for classification tasks, custom data must be acquired to ensure that learning models are developed from other pre-trained models. This paper proposes a place recognition model that is inspired by the graph cut energy function, which was originally designed for image segmentation. Common objects in the considered environment are identified and then passed to a graph-cut-inspired model for indoor environment classification. Additionally, faces in the considered environment are extracted and recognised. Finally, the developed model can recognise a face together with its environment. The strength of the proposed model lies in its ability to classify indoor environments without the usual training process(es), an approach that differs from what is obtained in traditional deep learning models. The classification capability of the developed model was compared to state-of-the-art models and exhibited promising outcomes.
Funding: Project (No. 2002CB312101) supported by the National Basic Research Program (973) of China.
Abstract: This paper presents techniques for synthesizing a novel view for a virtual viewpoint from two given views captured at different viewpoints, achieving both high quality and high efficiency. The whole process consists of three passes. The first pass recovers the depth map. We formulate it as pixel labelling and propose a bisection approach to solve it, which completes in log₂n steps (n is the number of depth levels), each involving a single graph cut computation. The second pass detects occluded pixels and reasons about their depth. It fits a foreground depth curve and a background depth curve using the depth of nearby foreground and background pixels, and then distinguishes foreground from background pixels by minimizing a global energy, which involves only one graph cut computation. The third pass finds, for each pixel in the novel view, the corresponding pixels in the input views and computes its color. The whole process involves only a small number of graph cut computations and is therefore efficient. Moreover, visual artifacts in the synthesized view can be removed successfully by correcting the depth of the occluded pixels. Experimental results demonstrate that both high quality and high efficiency are achieved by the proposed techniques.
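The bisection idea can be sketched per pixel, under the simplifying assumption that each split is decided by a plain cost comparison instead of the paper's global graph cut; the sketch still shows why ⌈log₂n⌉ steps suffice. The quadratic cost function is an invented example.

```python
import math

def bisect_depth(cost, n_levels):
    """Per-pixel sketch of the bisection labelling: each step halves the
    candidate depth range, so ceil(log2(n_levels)) split steps suffice.
    (The paper decides each split with one global graph cut; here the
    split is a simple comparison of the best cost in each half.)"""
    lo, hi, steps = 0, n_levels - 1, 0
    while lo < hi:
        mid = (lo + hi) // 2
        # Keep the half whose best matching cost is lower.
        if min(cost(d) for d in range(lo, mid + 1)) <= \
           min(cost(d) for d in range(mid + 1, hi + 1)):
            hi = mid
        else:
            lo = mid + 1
        steps += 1
    return lo, steps

# Quadratic matching cost with its minimum at depth 37 out of 64 levels.
depth, steps = bisect_depth(lambda d: (d - 37) ** 2, 64)
assert depth == 37 and steps == math.ceil(math.log2(64))
```

Replacing each comparison with a graph cut over all pixels couples the decisions spatially while keeping the logarithmic step count.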
Abstract: Recently, computer vision (CV) based disease diagnosis models have been utilized in various areas of healthcare. At the same time, deep learning (DL) and machine learning (ML) models play a vital role in the healthcare sector for the effectual recognition of diseases using medical imaging tools. This study develops a novel computer vision with optimal machine learning enabled skin lesion detection and classification (CVOML-SLDC) model. The goal of the CVOML-SLDC model is to determine the appropriate class labels for test dermoscopic images. Primarily, the CVOML-SLDC model applies a Gaussian filtering (GF) approach to pre-process the input images, followed by graph cut segmentation. Besides, the firefly algorithm (FFA) with an EfficientNet-based feature extraction module is applied for effectual derivation of feature vectors. Moreover, a naïve Bayes (NB) classifier is utilized for skin lesion detection and classification. The application of FFA helps to effectually adjust the hyperparameter values of the EfficientNet model. The experimental analysis of the CVOML-SLDC model is performed using a benchmark skin lesion dataset. The detailed comparative study of the CVOML-SLDC model reported improved outcomes over recent approaches, with a maximum accuracy of 94.83%.
Abstract: Background: Optical coherence tomography (OCT) is a non-invasive imaging system that can be used to obtain images of the anterior segment. Automatic segmentation of these images will enable them to be used to construct patient-specific biomechanical models of the human eye. These models could be used to help with treatment planning and diagnosis. Methods: A novel graph cut technique using regional and shape terms was developed. It was evaluated by segmenting 39 OCT images of the anterior segment, and the results were compared with manual segmentation and a previously reported level set segmentation technique. Three comparison metrics were used: Dice's similarity coefficient (DSC), mean unsigned surface positioning error (MSPE), and 95% Hausdorff distance (HD). A paired t-test was used to compare the results of the different segmentation techniques. Results: In comparison with manual segmentation, a mean DSC value of 0.943±0.020 was achieved, outperforming previously published techniques. A substantial reduction in processing time was also achieved with this method. Conclusions: We have developed a new segmentation technique that is both fast and accurate, with the potential to aid diagnostics and treatment planning.
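The Dice similarity coefficient used for evaluation is DSC = 2|A∩B|/(|A|+|B|); a minimal sketch on two small binary masks (the masks are invented examples):

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient of two binary masks: 2|A∩B| / (|A|+|B|)."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

auto = np.array([[0, 1, 1], [0, 1, 1]])     # automatic segmentation
manual = np.array([[0, 1, 1], [0, 1, 0]])   # manual reference
# |A∩B| = 3, |A| = 4, |B| = 3  ->  DSC = 6/7
assert abs(dice(auto, manual) - 6 / 7) < 1e-12
```

A DSC of 1.0 means perfect overlap, so the reported 0.943±0.020 indicates very close agreement with the manual reference.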
Funding: Supported by the National Natural Science Foundation of China (Grant No. 10771053), the Specialized Research Foundation for the Doctoral Program of Higher Education of China (SRFDP) (Grant No. 20060512001), and the Natural Science Foundation of Hubei Province (Grant No. 2007ABA139).
Abstract: Semi-supervised learning has been of growing interest over the past few years, and many methods have been proposed. Although various algorithms are available to implement semi-supervised learning, there are still gaps in our understanding of how the generalization error depends on the numbers of labeled and unlabeled data. In this paper, we consider a graph-based semi-supervised classification algorithm and establish its generalization error bounds. Our results show the close relation between generalization performance and the structural invariants of the data graph.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 11371038, 11471025, 11421101 and 61121002).
Abstract: This is primarily an expository paper surveying up-to-date known results on the spectral theory of the 1-Laplacian on graphs and its applications to the Cheeger cut, max-cut, and multi-cut problems. The structure of eigenspaces, nodal domains, multiplicities of eigenvalues, and algorithms for graph cuts are collected.
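The Cheeger cut quantity at the heart of this survey can be illustrated on a toy graph: h(S) = cut(S, S̄) / min(vol(S), vol(S̄)), with volumes measured by vertex degrees, minimized here by brute force over subsets. The graph (two triangles joined by a bridge) is an invented example, and none of the survey's spectral machinery is reproduced.

```python
from itertools import combinations

# Two triangles joined by a single bridge edge: the natural Cheeger cut
# separates the triangles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
nodes = range(6)
deg = {v: sum(v in e for e in edges) for v in nodes}

def cheeger_ratio(S):
    """h(S) = (# edges crossing the cut) / min(vol(S), vol(complement))."""
    S = set(S)
    cut = sum((u in S) != (v in S) for u, v in edges)
    vol = min(sum(deg[v] for v in S),
              sum(deg[v] for v in nodes if v not in S))
    return cut / vol

best = min((frozenset(S) for k in range(1, 6) for S in combinations(nodes, k)),
           key=cheeger_ratio)
assert best in (frozenset({0, 1, 2}), frozenset({3, 4, 5}))
```

Brute force is exponential; the point of the 1-Laplacian spectral theory is that the second eigenvalue and its eigenvectors characterize exactly this optimal ratio.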
Abstract: Video text detection is a challenging problem, since video image backgrounds are generally complex and subtitles often suffer from color bleeding, fuzzy boundaries, and low contrast due to lossy video compression and low resolution. In this paper, we propose a robust framework to solve these problems. Firstly, we exploit a gradient amplitude map (GAM) to enhance the edges of an input image, which overcomes the problems of color bleeding and fuzzy boundaries. Secondly, a two-direction morphological filtering is developed to filter background noise and enhance the contrast between background and text. Thirdly, maximally stable extremal regions (MSER) are applied to detect text regions with two extreme colors; we use the mean intensity of the regions as the graph cuts' label set, and the Euclidean distance of the three channels in HSI color space as the graph cuts smooth term, to get optimal segmentations. Finally, we group the regions into text lines using the geometric characteristics of the text, and then corner detection, multi-frame verification, and some heuristic rules are used to eliminate non-text regions. We tested our scheme on several challenging videos, and the results show that our text detection framework is more robust than previous methods.
Funding: Supported by the National Key Basic Research and Development (973) Program of China (No. 2010CB731800).
Abstract: Three-dimensional (3-D) video applications, such as 3-D cinema, 3DTV, and Free Viewpoint Video (FVV), are attracting more attention both from industry and in the literature. High accuracy of depth video is a fundamental prerequisite for most 3-D applications. However, accurate depth requires computationally intensive global optimization; this high computational complexity is one of the bottlenecks to applying depth generation in 3-D applications, especially over mobile networks, since mobile terminals usually have limited computing ability. This paper presents a semi-global depth estimation algorithm based on temporal consistency, where depth propagation is used to generate initial depth values for the computationally intensive global optimization. The accuracy of the initial depth is improved by detecting and eliminating depth propagation outliers before the global optimization. Integrating the initial values without outliers into the global optimization reduces the computational complexity while maintaining depth accuracy. Tests demonstrate that the algorithm reduces the total computation time by 54%-65% while the quality of the virtual views remains essentially equivalent to the benchmark.
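The abstract does not specify how propagation outliers are detected; as a hedged stand-in, the sketch below flags propagated depth values by median absolute deviation, an assumption made for illustration rather than the paper's actual test.

```python
import statistics

def flag_outliers(depths, thresh=3.0):
    """Flag propagated depth values that deviate strongly from the median;
    flagged pixels would be re-estimated by the global optimization."""
    med = statistics.median(depths)
    mad = statistics.median(abs(d - med) for d in depths) or 1.0
    return [abs(d - med) / mad > thresh for d in depths]

propagated = [50, 51, 49, 50, 120, 52]   # one propagation failure at 120
flags = flag_outliers(propagated)
assert flags == [False, False, False, False, True, False]
```

Only the flagged pixels re-enter the expensive global optimization, which is how the initial propagated values cut the total computation time.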