Abstract: We present a robust connected-component (CC) based method for automatic detection and segmentation of text in real-scene images. This technique can be applied in robot vision, sign recognition, meeting processing and video indexing. First, a Non-Linear Niblack method (NLNiblack) is proposed to decompose the image into candidate CCs. Then, all these CCs are fed into a cascade of classifiers trained by the AdaBoost algorithm. Each classifier in the cascade responds to one feature of the CC. We propose 12 novel features that are insensitive to noise, scale, text orientation and text language. The classifier cascade allows non-text CCs of the image to be rapidly discarded while more computation is spent on promising text-like CCs. The CCs that pass through the cascade are considered text components and are used to form the segmentation result. A prototype system was built, and experimental results demonstrate the effectiveness and efficiency of the proposed method.
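The early-rejection behaviour of the classifier cascade described above can be sketched as follows. This is only an illustrative sketch: the `Component` fields, the two features (`aspect_ratio`, `fill_ratio`) and the hand-picked accept ranges are hypothetical stand-ins, not the 12 features or the AdaBoost-trained thresholds of the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Component:
    """Hypothetical summary of one connected component (CC)."""
    width: int        # bounding-box width in pixels
    height: int       # bounding-box height in pixels
    pixel_count: int  # number of foreground pixels in the CC


@dataclass
class Stage:
    """One cascade stage: a single feature tested against an accept range."""
    name: str
    feature: Callable[[Component], float]
    low: float
    high: float

    def accepts(self, cc: Component) -> bool:
        return self.low <= self.feature(cc) <= self.high


def aspect_ratio(cc: Component) -> float:
    return cc.width / cc.height


def fill_ratio(cc: Component) -> float:
    return cc.pixel_count / (cc.width * cc.height)


def run_cascade(cc: Component, stages: List[Stage]) -> bool:
    """Return True only if every stage accepts; stop at the first rejection."""
    for stage in stages:
        if not stage.accepts(cc):
            return False  # non-text CC discarded early, later stages skipped
    return True


# Assumed thresholds for illustration; a trained cascade would learn these.
STAGES = [
    Stage("aspect_ratio", aspect_ratio, 0.1, 10.0),
    Stage("fill_ratio", fill_ratio, 0.2, 0.9),
]

candidates = [
    Component(12, 20, 110),   # plausible character shape
    Component(200, 2, 400),   # long thin line, rejected at stage 1
    Component(30, 30, 900),   # solid block, rejected at stage 2
]
text_like = [cc for cc in candidates if run_cascade(cc, STAGES)]
```

Ordering cheap, highly discriminative stages first is what lets most non-text components exit after one or two feature evaluations.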
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 60572154) and the National Basic Research Program of China (Grant No. 2003CB716104).
Acknowledgment: I would like to thank YANG Xin, my tutor, SHANG Yan-feng, SUN Kun of Shanghai Children's Medical Center, and all the people in the 3D Visualization Laboratory of Shanghai Jiaotong University for their help during my research.
Abstract: Medical diagnosis software and computer-assisted surgical systems often use segmented image data to help clinicians make decisions. The segmentation extracts the region of interest from the background, which makes the visualization clearer. However, no segmentation method can guarantee accurate results under all circumstances. As a result, clinicians need a solution that enables them to check and validate the segmentation accuracy and that displays the segmented area without ambiguity. With the method presented in this paper, the real CT or MR image is displayed within the segmented region, and the segmented boundaries can be expanded or contracted interactively. In this way, clinicians are able to check and validate the segmentation visually and make more reliable decisions. In experiments with real data from a hospital, the presented method proved suitable for efficiently detecting segmentation errors. The new algorithm uses graphics processing unit (GPU) shading functions recently introduced in graphics cards and is fast enough to allow interaction with the segmented area, which was not possible with previous methods.
Funding: Supported by the Natural Science Foundation of Fujian Province, China (Grant No. 2022J01611).
Abstract: Rice blast is regarded as one of the major diseases of rice. Screening rice genotypes with high resistance to rice blast is a key strategy for ensuring global food security. Unmanned aerial vehicle (UAV)-based imaging, coupled with deep learning, can acquire high-throughput imagery related to rice blast infection. In this study, we developed a segmented detection model (called RiceblastSegMask) for rice blast detection and resistance evaluation. The feasibility of different backbones and target detection models was further investigated. RiceblastSegMask is a two-stage instance segmentation model, comprising an image-denoising backbone network, a feature pyramid, a trinomial tree fine-grained feature extraction combination network, and an image pixel codec module. The results showed that the model combining the image-denoising and fine-grained feature extraction based on the Swin Transformer and the feature pixel matching feature labels with the trinomial tree recursive algorithm performed the best. The overall accuracy for instance segmentation of RiceblastSegMask reached 97.56%, and it demonstrated a satisfactory accuracy of 90.29% for grading unique resistance to rice blast. These results indicated that low-altitude remote sensing using UAV, in conjunction with the proposed RiceblastSegMask model, can efficiently calculate the extent of rice blast infection, offering a new phenotypic tool for evaluating rice blast resistance on a field scale in rice breeding programs.
Abstract: Undeniably, Deep Learning (DL) has rapidly displaced traditional machine learning in Remote Sensing (RS) and geoscience domains, with applications such as scene understanding, material identification, extreme weather detection, and oil spill identification, among many others. Traditional machine learning algorithms are given less and less attention in the era of big data. Recently, a substantial amount of work has aimed at developing image classification approaches based on the success of DL models in computer vision. The number of relevant articles has nearly doubled every year since 2015. Advances in remote sensing technology, as well as the rapidly expanding volume of publicly available satellite imagery on a worldwide scale, have opened up possibilities for a wide range of modern applications. However, there are challenges related to the availability of annotated data, the complex nature of the data, and model parameterization, which strongly impact performance. In this article, a comprehensive review of the literature encompassing a broad spectrum of pioneering work in remote sensing image classification is presented, including network architectures (vintage Convolutional Neural Network, CNN; Fully Convolutional Networks, FCN; encoder-decoder; recurrent networks; attention models; and generative adversarial models). The characteristics, capabilities, and limitations of current DL models are examined, and potential research directions are discussed.
Funding: Supported by the National Basic Research Program of China (Grant No. 2006CB303103) and the Key Program of the National Natural Science Foundation of China (Grant No. 60833009).
Abstract: We address the problem of 3D human pose estimation from a single real-scene image. Normally, 3D pose estimation from a real image needs background subtraction to extract the appropriate features. We do not make such an assumption. In this paper, a two-step approach is proposed. First, instead of applying background subtraction to obtain the segmentation of the human, we combine the segmentation with human detection using an ISM-based detector. Then, silhouette features are extracted and 3D pose estimation is solved as a regression problem, using RVMs and ridge regression. The results show the robustness and accuracy of our method.
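The regression step above can be illustrated with plain ridge regression, whose closed-form solution is w = (XᵀX + λI)⁻¹Xᵀy. The sketch below is a generic implementation under assumptions: the feature rows stand in for silhouette descriptors and the target for a single pose coordinate, both hypothetical, and the RVM variant is not shown.

```python
from typing import List


def ridge_fit(X: List[List[float]], y: List[float], lam: float) -> List[float]:
    """Solve (X^T X + lam * I) w = X^T y by Gaussian elimination."""
    n = len(X[0])
    # Normal-equation matrix A = X^T X + lam * I and right-hand side b = X^T y.
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(row[i] * t for row, t in zip(X, y)) for i in range(n)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - tail) / A[i][i]
    return w


def ridge_predict(w: List[float], x: List[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


# Toy data (hypothetical): two-dimensional "silhouette features" and one
# pose coordinate per training example.
features = [[1.0, 0.0], [0.0, 1.0]]
pose_coordinate = [2.0, 4.0]
weights = ridge_fit(features, pose_coordinate, lam=1.0)
```

A positive λ shrinks the weights toward zero, which keeps the system solvable and stabilises the estimate when silhouette features are noisy or correlated.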
Funding: Supported by the Foundation of Youth Innovation Promotion Association, Chinese Academy of Sciences (No. 20150192).
Abstract: Detecting and tracking multiple targets simultaneously for space-based surveillance requires multiple cameras, which leads to a large system volume and weight. To address this problem, we propose a wide-field detection and tracking system using the segmented planar imaging detector for electro-optical reconnaissance. This study realizes two operating modes, a detection mode and a tracking mode, by changing the working paired lenslets and the corresponding waveguide arrays. A model system was simulated and evaluated using the peak signal-to-noise ratio method. The simulation results indicate that the detection and tracking system can realize wide-field detection and narrow-field, multi-target, high-resolution tracking without moving parts.
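For reference, the peak signal-to-noise ratio used in the evaluation above is conventionally defined as PSNR = 10·log10(MAX² / MSE). A minimal sketch of that standard definition follows; the 8-bit peak value of 255 and the tiny example images are assumptions for illustration, not data from the paper.

```python
import math
from typing import List, Sequence


def psnr(reference: Sequence[Sequence[float]],
         reconstructed: Sequence[Sequence[float]],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref: List[float] = [p for row in reference for p in row]
    rec: List[float] = [p for row in reconstructed for p in row]
    if not ref or len(ref) != len(rec):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 images differing in a single pixel by 10 grey levels.
truth = [[0, 255], [255, 0]]
simulated = [[0, 255], [255, 10]]
score = psnr(truth, simulated)
```

Higher values indicate a reconstruction closer to the reference; identical images give an infinite PSNR because the mean squared error is zero.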