Semantic segmentation has recently witnessed rapid progress, but existing methods only focus on identifying objects or instances. In this work, we aim to address the task of semantic understanding of scenes with deep ...Semantic segmentation has recently witnessed rapid progress, but existing methods only focus on identifying objects or instances. In this work, we aim to address the task of semantic understanding of scenes with deep learning. Different from many existing methods, our method focuses on putting forward some techniques to improve the existing algorithms, rather than to propose a whole new framework. Objectness enhancement is the first effective technique. It exploits the detection module to produce object region proposals with category probability, and these regions are used to weight the parsing feature map directly. 'Extra background' category, as a specific category, is often attached to the category space for improving parsing result in semantic and instance segmentation tasks. In scene parsing tasks, extra background category is still beneficial to improve the model in training. However, some pixels may be assigned into this nonexistent category in inference. Black-hole filling technique is proposed to avoid the incorrect classification. For verifying these two techniques, we integrate them into a parsing framework for generating parsing result. We call this unified framework as Objectness Enhancement Network (OENet). Compared with previous work, our proposed OENet system effectively improves the performance over the original model on SceneParse150 scene parsing dataset, reaching 38.4 mIoU (mean intersection-over-union) and 77.9% accuracy in the validation set without assembling multiple models. Its effectiveness is also verified on the Cityscapes dataset.展开更多
This paper first introduces the concept of a geogram that captures richer features to represent the objects. The spatiogram contains some moments upon the coordinates of the pixels corresponding to each bin, while the...This paper first introduces the concept of a geogram that captures richer features to represent the objects. The spatiogram contains some moments upon the coordinates of the pixels corresponding to each bin, while the geogram contains information about the perimeter of grouped regions in addition to features in the spatiogram.Then we consider that a convergence process of mean shift is divided into obvious dynamic and steady states,and introduce a hybrid technique of feature description, to control the convergence process. Also, we propose a spline resampling to control the balance between computational cost and accuracy of particle filtering. Finally, we propose a boosting-refining approach, which is boosting the particles positioned in the ill-posed condition instead of eliminating the ill-posed particles, to refine the particles. It enables the estimation of the object state to obtain high accuracy. Experimental results show that our approach has promising discriminative capability in comparison with the state-of-the-art approaches.展开更多
文摘Semantic segmentation has recently witnessed rapid progress, but existing methods only focus on identifying objects or instances. In this work, we aim to address the task of semantic understanding of scenes with deep learning. Different from many existing methods, our method focuses on putting forward some techniques to improve the existing algorithms, rather than to propose a whole new framework. Objectness enhancement is the first effective technique. It exploits the detection module to produce object region proposals with category probability, and these regions are used to weight the parsing feature map directly. 'Extra background' category, as a specific category, is often attached to the category space for improving parsing result in semantic and instance segmentation tasks. In scene parsing tasks, extra background category is still beneficial to improve the model in training. However, some pixels may be assigned into this nonexistent category in inference. Black-hole filling technique is proposed to avoid the incorrect classification. For verifying these two techniques, we integrate them into a parsing framework for generating parsing result. We call this unified framework as Objectness Enhancement Network (OENet). Compared with previous work, our proposed OENet system effectively improves the performance over the original model on SceneParse150 scene parsing dataset, reaching 38.4 mIoU (mean intersection-over-union) and 77.9% accuracy in the validation set without assembling multiple models. Its effectiveness is also verified on the Cityscapes dataset.
基金Project supported by the National Natural Science Foundation of China(No.61073094)
文摘This paper first introduces the concept of a geogram that captures richer features to represent the objects. The spatiogram contains some moments upon the coordinates of the pixels corresponding to each bin, while the geogram contains information about the perimeter of grouped regions in addition to features in the spatiogram.Then we consider that a convergence process of mean shift is divided into obvious dynamic and steady states,and introduce a hybrid technique of feature description, to control the convergence process. Also, we propose a spline resampling to control the balance between computational cost and accuracy of particle filtering. Finally, we propose a boosting-refining approach, which is boosting the particles positioned in the ill-posed condition instead of eliminating the ill-posed particles, to refine the particles. It enables the estimation of the object state to obtain high accuracy. Experimental results show that our approach has promising discriminative capability in comparison with the state-of-the-art approaches.