Funding: Supported by the National Natural Science Foundation of China [61772242, 61976106, 61572239]; the China Postdoctoral Science Foundation [2017M611737]; the Six Talent Peaks Project in Jiangsu Province [DZXX-122]; the Jiangsu Province Emergency Management Science and Technology Project [YJGL-TG-2020-8]; the Key Research and Development Plan of Zhenjiang City [SH2020011]; and the Postgraduate Innovation Fund of Jiangsu Province [KYCX18_2257].
Abstract: Because the pancreas occupies only a small region of a whole abdominal computed tomography (CT) scan and varies widely in shape, location and size, deep neural networks for automatic pancreas segmentation are easily confused by the complex and variable background. To alleviate these issues, this paper proposes a novel pancreas segmentation optimization based on the coarse-to-fine structure. The coarse stage increases the proportion of the target region in the input image by means of the minimum bounding box. The fine stage improves segmentation accuracy by enhancing data diversity and introducing a new segmentation model, and reduces running time by adding a total weights constraint. Evaluated on the public pancreas segmentation dataset, the optimization achieves an average Dice-Sørensen coefficient (DSC) of 87.87%, which is 0.94 percentage points higher than the 86.93% achieved by state-of-the-art pancreas segmentation methods. Moreover, the method generalizes well and can be easily applied to other coarse-to-fine or one-step organ segmentation tasks.
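The coarse-stage cropping and the evaluation metric described above can be illustrated with a minimal sketch. It assumes binary masks stored as nested lists of 0/1 values; the function names `min_bounding_box`, `crop` and `dice` and the `margin` parameter are hypothetical and are not taken from the paper, whose segmentation model is not reproduced here.

```python
def min_bounding_box(mask, margin=0):
    """Return (top, bottom, left, right), inclusive, of the nonzero region
    of a 2-D binary mask, optionally padded by `margin` pixels.
    Returns None when the mask contains no foreground."""
    height, width = len(mask), len(mask[0])
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(width) if any(row[c] for row in mask)]
    if not rows:
        return None
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + margin, height - 1)
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + margin, width - 1)
    return top, bottom, left, right


def crop(image, box):
    """Cut the bounding box out of an image so the target fills more of the input."""
    top, bottom, left, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def dice(pred, truth):
    """Dice-Sorensen coefficient 2|A n B| / (|A| + |B|) of two binary masks."""
    inter = sum(1 for pr, tr in zip(pred, truth)
                for p, t in zip(pr, tr) if p and t)
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    return 1.0 if total == 0 else 2.0 * inter / total
```

In this sketch the coarse prediction is only used to locate the organ; the cropped region would then be passed to whatever fine-stage model is in use.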
Abstract: The coarse-to-fine pyramid and scale space are two important image structures in image matching. However, the advantage of the coarse-to-fine pyramid is neglected because the pyramid structure is usually constructed with the down-sampling method in scale space. In addition, the lattices of a single image differ in importance. Based on these observations, a new multi-pyramid (M-P) image spatial structure is constructed. First, the coarse-to-fine pyramid is built by partitioning the original image into increasingly finer lattices, and the number of interest points in each lattice is adopted as that lattice's non-normalized weight on each pyramid level. Second, the scale space of each lattice on each pyramid level is generated with the classic Gaussian kernel. Third, the descriptors of each lattice are generated by regarding the stability of scale space as the description of the image. A parallel version of the M-P algorithm is also presented to accelerate computation. Finally, comprehensive experimental results show that the multi-pyramid structure, which combines the coarse-to-fine spatial pyramid with scale space, generates more effective features than the other related methods.
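The first step, partitioning the image into increasingly finer lattices and weighting each lattice by its interest-point count, might look roughly like the sketch below. It assumes that level l splits the image into 2^l by 2^l cells, which the abstract does not state; the function name `lattice_weights` and its parameters are hypothetical, and the scale-space and descriptor steps are not covered.

```python
def lattice_weights(points, width, height, levels):
    """Count interest points per lattice cell on each pyramid level.

    points: iterable of (x, y) coordinates with 0 <= x < width, 0 <= y < height.
    Returns a list with one grid per level; grid[row][col] is the
    non-normalized weight (point count) of that cell.
    Assumption: level l uses a 2**l x 2**l partition.
    """
    pyramid = []
    for level in range(levels):
        n = 2 ** level
        counts = [[0] * n for _ in range(n)]
        for x, y in points:
            # Map the coordinate to its cell index, clamping the far edge.
            col = min(int(x * n / width), n - 1)
            row = min(int(y * n / height), n - 1)
            counts[row][col] += 1
        pyramid.append(counts)
    return pyramid
```

For example, four points in an 8 x 8 image give a single weight of 4 on level 0 and a 2 x 2 grid of counts on level 1, so cells rich in interest points carry more weight at every level.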
Abstract: A coarse-to-fine strategy based on the minimal generalized time-bandwidth product is proposed, with one novel idea highlighted: adopting a coarse-to-fine strategy to speed up the search process. Simulation results on synthetic and real signals show the validity of the proposed method.
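How a coarse-to-fine search saves evaluations can be shown with a generic one-dimensional sketch. This is not the paper's algorithm: the `cost` argument stands in for whatever criterion is being minimized (here, the generalized time-bandwidth product), and the name `coarse_to_fine_min` and the `points` and `rounds` parameters are hypothetical. The sketch also assumes the cost is well behaved enough that its minimum lies within one grid step of the best coarse sample.

```python
def coarse_to_fine_min(cost, lo, hi, points=11, rounds=4):
    """Approximate the minimizer of `cost` on [lo, hi].

    Each round samples `points` evenly spaced values, keeps the best one,
    and shrinks the interval to one grid step on either side of it.
    """
    lo0, hi0 = lo, hi
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = min(grid, key=cost)
        # Refine around the current best, staying inside the original range.
        lo, hi = max(best - step, lo0), min(best + step, hi0)
    return best
```

With the defaults on the interval [0, 1], the final grid spacing is 0.0008 after only 44 cost evaluations, whereas a single uniform grid at that spacing would need over a thousand.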
Abstract: Near-duplicate image detection is a necessary operation for refining image search results so that users can explore them efficiently. The existence of large numbers of near duplicates requires fast and accurate automatic near-duplicate detection methods. We have designed a coarse-to-fine near-duplicate detection framework to speed up the process and a multi-modal integration scheme for accurate detection. Duplicate pairs are detected with both a global feature (partition-based color histogram) and local features (CPAM and the SIFT bag-of-words model). Experimental results on a large-scale data set demonstrate the effectiveness of the proposed design.
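The coarse-to-fine idea can be sketched as a cheap global comparison that filters pairs before a more expensive local comparison runs. The sketch below is a simplification under stated assumptions: each image is represented by a precomputed, normalized color histogram and a bag-of-words count vector, the similarity measures (histogram intersection and cosine similarity) and both thresholds are illustrative choices, and all names are hypothetical. CPAM features and the paper's multi-modal integration scheme are not modeled.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two histograms that each sum to 1."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def cosine(u, v):
    """Cosine similarity of two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) * sum(b * b for b in v)) ** 0.5
    return dot / norm if norm else 0.0


def near_duplicates(images, coarse_thr=0.8, fine_thr=0.9):
    """images: dict mapping name -> (color_histogram, bow_vector).

    Returns the sorted list of name pairs that pass both stages.
    """
    names = sorted(images)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            # Coarse stage: discard pairs whose global color layout differs.
            if histogram_intersection(images[a][0], images[b][0]) < coarse_thr:
                continue
            # Fine stage: compare local-feature bag-of-words vectors.
            if cosine(images[a][1], images[b][1]) >= fine_thr:
                pairs.append((a, b))
    return pairs
```

Only pairs that survive the coarse stage incur the cost of the local-feature comparison, which is where the speed-up in this kind of design comes from.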
Abstract: Huge amounts of data bring various challenges to face image retrieval. This paper proposes an efficient method to address those problems. First, we use accurate facial landmark locations as shape features. Second, we utilise shape priors to provide discriminative texture features for convolutional neural networks. These shape and texture features are fused to make the learned representation more robust. Finally, to increase efficiency, a coarse-to-fine search mechanism is exploited to find similar objects. Extensive experiments on the CASIA-WebFace, MSRA-CFW, and LFW datasets illustrate the superiority of our method.
Abstract: Images from a monocular camera can be processed to detect depth information about obstacles in the blind spot area captured by a vehicle's side-view camera. The depth information is given as a classification result, "near" or "far", when two blocks in the image are compared with respect to their distances, and it can be used for blind spot area detection. In this paper, the depth information is inferred from a combination of blur cues and texture cues, and is estimated by comparing the features of two image blocks selected within a single image. A preliminary experiment demonstrates that a convolutional neural network (CNN) model trained with a set of relatively ideal images achieves good accuracy. The same CNN model is applied to distinguish near and far obstacles according to a specified threshold in the vehicle blind spot area, and promising results are obtained. The proposed method uses a standard blind spot camera and can improve safety without additional sensing devices. Thus, the approach has the potential to be applied in vehicular applications for detecting objects in the driver's blind spot.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60675017) and the National Basic Research Program of China (Grant No. 2006CB303103).
Abstract: A shot presents a contiguous action recorded by an uninterrupted camera operation, and frames within a shot maintain spatio-temporal coherence. Segmenting a serial video stream into meaningful shots is the first step in video analysis and content-based video understanding. In this paper, a novel scheme based on improved two-dimensional entropy is proposed to partition video into shots. First, shot transition candidates are detected using a two-pass algorithm: a coarse search pass and a fine search pass. Second, using the characteristics of the two-dimensional entropy of the image, correctly detected transition candidates are further classified into different transition types, whereas falsely detected shot breaks are identified and removed. Finally, the boundary of a gradual transition can be precisely located by incorporating the characteristics of the two-dimensional entropy of the image into the gradual transition. A large number of video sequences are used to test the system's performance, and promising results are obtained.
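The two-pass candidate detection can be sketched as follows, assuming each frame has already been reduced to a single scalar feature (for instance an entropy value). The function name `detect_cuts`, the `stride` and `threshold` parameters, and the use of a plain absolute difference are all hypothetical choices for illustration; the paper's improved two-dimensional entropy, transition-type classification and gradual-boundary localization are not reproduced.

```python
def detect_cuts(features, stride=4, threshold=0.5):
    """Locate abrupt changes in a per-frame feature sequence.

    Coarse pass: compare frames `stride` apart and skip windows that
    show no significant change.
    Fine pass: inside a flagged window, compare consecutive frames and
    report the index of the largest jump.
    Returns indices i such that a cut lies between frame i-1 and frame i.
    """
    cuts = []
    last = len(features) - 1
    for start in range(0, last, stride):
        end = min(start + stride, last)
        # Coarse pass: one comparison per window.
        if abs(features[end] - features[start]) < threshold:
            continue
        # Fine pass: frame-by-frame differences within the window.
        jump, idx = max((abs(features[i] - features[i - 1]), i)
                        for i in range(start + 1, end + 1))
        if jump >= threshold:
            cuts.append(idx)
    return cuts
```

A limitation of this sketch is that the coarse pass can miss a change that reverses within one window, so the stride trades speed against recall.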