Abstract
How to fuse multi-scale image information within deep learning is a key problem for deep-learning-based vision algorithms. This paper proposes a deep learning method based on multi-scale alternating iterative training and applies it to image semantic understanding. The algorithm uses a convolutional neural network (CNN) to extract dense features from the raw image that encode the rectangular region centered on each pixel; by alternately and iteratively training on images at multiple scales, it captures important information such as texture, color, and edges at different scales. On top of the classification results produced from the deep features, a method combining superpixel segmentation is proposed: the dominant class of each superpixel block is computed and used to correct misclassified pixels, while also delineating the boundary contours of the target regions, yielding the final semantic understanding. The effectiveness of the method is verified on the 8-class Stanford Background Dataset, where it achieves an accuracy of 77.4%.
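The superpixel correction step described above lends itself to a short illustration. The following is a minimal sketch, not the paper's implementation: it assumes per-pixel class predictions from the CNN are available as a 2-D label map, uses SLIC from scikit-image as one possible superpixel method, and all function names and parameter values are illustrative.

```python
# Sketch of the superpixel-based correction: each pixel is re-labelled with
# the dominant (most frequent) class of the superpixel block it belongs to.
import numpy as np
from skimage.segmentation import slic

def refine_with_superpixels(image, pixel_labels, n_segments=400):
    """image: HxWx3 RGB array; pixel_labels: HxW array of CNN class ids."""
    # Superpixel segmentation (SLIC used here as one possible choice)
    superpixels = slic(image, n_segments=n_segments, compactness=10)
    refined = pixel_labels.copy()
    for sp_id in np.unique(superpixels):
        mask = superpixels == sp_id
        # Dominant class inside this superpixel block
        dominant = np.bincount(pixel_labels[mask]).argmax()
        # Correct misclassified pixels and align labels to region boundaries
        refined[mask] = dominant
    return refined
```

Because superpixels adhere to image edges, voting within each block both corrects isolated misclassified pixels and sharpens the boundary contours of the target regions.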
Source
《光电子·激光》(Journal of Optoelectronics·Laser)
Indexed in: EI, CAS, CSCD, PKU Core Journals (北大核心)
2016, No. 2, pp. 224-230 (7 pages)
Funding
National Natural Science Foundation of China (61202168, 61403281, 61472278); Key Program of Tianjin Natural Science Foundation (14JCZDJC31700); Tianjin Municipal University Development Fund (20120802, 20130704).