Abstract
Recognizing the spines of shelved books in images can make book inventory more convenient and may enable a smoother borrowing experience for readers, such as grab-and-go checkout. Accurate segmentation of the spine regions is an important prerequisite. Unlike ordinary object segmentation, the difficulty of this problem lies in the spines being densely packed and repetitive. This paper proposes a mountain-shaped (山-shaped) deep neural network consisting of one encoder and two decoders. One decoder serves as the main spine segmentation channel, while the other incorporates spine boundary information to capture more spine edge detail. In addition, a book spine image dataset is established, containing 661 images and 15,454 manually annotated spine instances (polygons). Experimental results show that the proposed network achieves high accuracy in semantic segmentation of images of densely packed objects such as books, reaching a mean intersection over union (mIoU) of about 90% and a mean pixel accuracy of about 95% on the established dataset. It outperforms classical segmentation models, which verifies the effectiveness of the proposed model.
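To make the two reported metrics concrete, the following is a minimal sketch of how mean intersection over union and mean pixel accuracy are commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names, the two-class toy masks, and the choice to average pixel accuracy per class are all assumptions made for illustration.

```python
import numpy as np

def confusion_matrix(pred, target, num_classes):
    """Count pixels by (true class, predicted class); rows are true classes."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    idx = target * num_classes + pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN)."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0  # skip classes absent from both masks
    return float((tp[valid] / union[valid]).mean())

def mean_pixel_accuracy(cm):
    """Average over classes of TP / (pixels that truly belong to the class)."""
    tp = np.diag(cm)
    per_class_total = cm.sum(axis=1)
    valid = per_class_total > 0  # skip classes absent from the ground truth
    return float((tp[valid] / per_class_total[valid]).mean())

# Hypothetical toy masks: 0 = background, 1 = spine.
target = [[0, 0, 1, 1],
          [0, 1, 1, 1]]
pred = [[0, 1, 1, 1],
        [0, 1, 1, 0]]

cm = confusion_matrix(pred, target, num_classes=2)
print("mIoU:", mean_iou(cm))
print("mean pixel accuracy:", mean_pixel_accuracy(cm))
```

On a real dataset the confusion matrix would typically be accumulated over all test images before the two ratios are taken, so that large and small images contribute in proportion to their pixel counts.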
Source
《图像与信号处理》 (Journal of Image and Signal Processing)
2020, No. 4, pp. 218-225 (8 pages)