Abstract
In recent years, single-image dynamic scene blind deblurring (SIDSBD) methods based on convolutional neural networks (CNNs) have made significant progress. Their success stems mainly from three design choices: multi-scale or multi-patch models, encoder-decoder architectures, and residual block structures. Building on these, this paper proposes a novel multi-scale convolutional neural network (MSCNN) that further exploits the advantages of the multi-scale model, the encoder-decoder architecture, and the residual block structure to achieve higher-quality dynamic scene blind deblurring. First, inspired by spatial pyramid pooling (SPP) and the multi-patch model, a hierarchical multi-patch channel attention (HMPCA) mechanism is proposed. HMPCA adaptively assigns channel-wise weights to feature maps using both global and local feature statistics. Because it uses local information, HMPCA can be regarded as enlarging the receptive field in the channel direction, which further enhances the representational ability of the network. Second, unlike existing multi-scale models, a new multi-scale model is developed in which each scale consists of multiple encoders and multiple decoders. Because of HMPCA, the encoders and decoders within the same scale are not exactly identical, so the proposed model can be viewed as increasing the depth of the encoder-decoder architecture. This improves the deblurring performance at each scale and ultimately yields higher-quality dynamic scene blind deblurring. Extensive experiments show that, compared with several successful recent SIDSBD methods, the proposed method restores higher-quality deblurred images, with significant improvements in both objective evaluation metrics and subjective visual quality.
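The idea of weighting channels with both global and local statistics can be illustrated with a minimal sketch. This is not the paper's HMPCA: the abstract does not give the layer design, so the function names, the `levels` parameter, the use of patch means as the statistic, and the plain sigmoid gate (with no learned layers) are all assumptions made for illustration. Level 1 pools over the whole channel (a global statistic); higher levels pool over an n x n grid of patches, in the spirit of spatial pyramid pooling.

```python
import math


def sigmoid(x):
    """Logistic function mapping a statistic to a gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def patch_means(channel, n):
    """Average-pool one H x W channel (nested lists) into an n x n grid.

    Assumes H and W are divisible by n (checked by the caller).
    """
    h, w = len(channel), len(channel[0])
    ph, pw = h // n, w // n
    grid = []
    for i in range(n):
        row = []
        for j in range(n):
            vals = [channel[r][c]
                    for r in range(i * ph, (i + 1) * ph)
                    for c in range(j * pw, (j + 1) * pw)]
            row.append(sum(vals) / len(vals))
        grid.append(row)
    return grid


def hierarchical_channel_attention(feat, levels=(1, 2)):
    """Reweight a C x H x W feature map using global and local patch means.

    Hypothetical simplification: each pixel of a channel is scaled by the
    average of the sigmoid gates of the patches covering it, one patch per
    level. A real network would pass the statistics through learned layers.
    """
    out = []
    for channel in feat:
        h, w = len(channel), len(channel[0])
        for n in levels:
            if h % n or w % n:
                raise ValueError("H and W must be divisible by every level")
        grids = [(n, patch_means(channel, n)) for n in levels]
        new_channel = []
        for r in range(h):
            new_row = []
            for c in range(w):
                gates = [sigmoid(g[r // (h // n)][c // (w // n)])
                         for n, g in grids]
                new_row.append(channel[r][c] * sum(gates) / len(gates))
            new_channel.append(new_row)
        out.append(new_channel)
    return out
```

With `levels=(1,)` the sketch reduces to a purely global, per-channel gate; adding level 2 or higher lets the gate vary across patches of the same channel, which is the local information the abstract refers to.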
Authors
TANG Shu (唐述)
WAN Sheng-Dao (万盛道)
XIE Xian-Zhong (谢显中)
YANG Shu-Li (杨书丽)
HUANG Rong (黄容)
GU Jia (顾佳)
ZHENG Wan-Peng (郑万鹏)
Affiliation: Chongqing Key Laboratory of Computer Network and Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Source
Journal of Software (《软件学报》)
Indexed in: EI, CSCD, PKU Core (北大核心)
2022, No. 9, pp. 3498-3511 (14 pages)
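Funding
National Natural Science Foundation of China (61601070, 61501074)
Key Project of Science and Technology Research of the Chongqing Municipal Education Commission (KJZD-K201800603)
Major Project of Science and Technology Research of the Chongqing Municipal Education Commission (KJZD-M201900602)
Chongqing Basic Research and Frontier Exploration Project (cstc2018jcyjAX0432)
Chongqing Technology Innovation and Application Development Special Project, General Program (cstc2020jscx-msxmX0135)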
Keywords
convolutional neural network (CNN)
blind deblurring for dynamic scenes
multi-scale model
channel attention
spatial pyramid pooling (SPP)