Infrared and Visible Image Fusion Based on Dual Channel Residual Dense Network (Cited by: 1)
Abstract: To mitigate the loss of detail features between infrared-visible fusion results and the source images, and to fully extract the feature information contained in infrared and visible images, an improved dual-channel deep-learning auto-encoder network is proposed for infrared and visible image fusion. The dual-channel structure is built by cascading dense-connection and residual-connection modules, and a loss function that jointly preserves pixel, structural-similarity, and gradient features is designed, so that the encoder can fully extract multi-level features of the infrared and visible images. In the fusion layer, a spatial L1-norm strategy and an attention mechanism are applied to fuse the two cascaded channel features separately, and a corresponding decoder reconstructs the fused feature maps to obtain the final fusion result. Comparative experiments with traditional algorithms and recent deep-learning algorithms show that the method achieves excellent comprehensive performance both subjectively and objectively.

Extended abstract: In the infrared and visible image fusion task, the visible image contains abundant texture and background information, while the infrared image contains salient target information. The two are complementary and together can represent the visual information of a scene effectively and comprehensively. To reduce the loss of features between the fused image and the source images, and to fully extract the feature information in the infrared and visible images, this paper proposes an improved dual-channel deep-learning auto-encoder network for infrared and visible image fusion.

The encoder consists of three cascaded dual-channel layers, each composed of cascaded residual- and dense-connection modules. The source image is split into two paths and fed simultaneously into the residual-connection network and the dense-connection network. The residual connections are effective at highlighting target features, while the dense connections are good at preserving the texture details of the source image, so the encoder can fully extract multi-level features of the infrared and visible images.

In the fusion layer, the spatial L1 norm and the channel attention mechanism are used to fuse the cascaded residual and dense channel features, respectively. The spatial L1-norm strategy uses the L1 norm as the activity-level measure and emphasizes the fusion of spatial information, while the channel attention mechanism obtains a weight for each channel through a global pooling operation, so that the information contained in each channel can be measured by its weight and fused effectively.

Finally, a corresponding decoder is designed to reconstruct the fused feature image. The decoder treats the dense and residual features differently according to the characteristics of the encoder: the high-dimensional dense feature layers are deeper, so more convolutional layers are used to restore them, while the low-dimensional residual feature layers are shallower, so fewer convolutional layers are used. In this way, features from different channels and levels are combined to obtain the final fusion result.

In the network training stage, the fusion layer is removed and 5000 images randomly selected from the ImageNet dataset are used as the training set for the auto-encoder network. The sum of the pixel loss, gradient loss, and structural-similarity loss is used as the loss function to guide the optimization of the network parameters.
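Written out, the training objective described above is the plain sum of the three terms. The concrete distance measures shown below (squared pixel error, an L1 difference of gradients, and one minus SSIM) are common choices assumed here for illustration; the abstract itself only states that the three losses are summed.

```latex
\mathcal{L} = \mathcal{L}_{\mathrm{pixel}} + \mathcal{L}_{\mathrm{grad}} + \mathcal{L}_{\mathrm{ssim}},
\qquad
\mathcal{L}_{\mathrm{pixel}} = \lVert O - I \rVert_2^{2},
\quad
\mathcal{L}_{\mathrm{grad}} = \lVert \nabla O - \nabla I \rVert_1,
\quad
\mathcal{L}_{\mathrm{ssim}} = 1 - \mathrm{SSIM}(O, I)
```

Here I is the input image and O is the decoder's reconstruction; because the fusion layer is removed during training, the auto-encoder is simply trained to reproduce its input.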
In the experimental phase, ablation experiments are carried out on both the network structure and the fusion strategy. For the network structure, comparison with single residual-channel and single dense-channel networks shows that the dual-channel structure indeed improves the feature-extraction ability. For the fusion strategy, comparison with classical strategies such as addition, mean, and maximum shows that the dual-path fusion strategy exploits the advantages of the dual-channel structure and effectively integrates the salient features and detail features of the source images. Finally, the proposed method is compared with traditional algorithms and recent deep-learning algorithms. The results show that the proposed method better reflects target features and background contour information subjectively, and maintains the information balance between infrared and visible images in most fusion scenes, yielding high-quality fused images. Objectively, it leads on the CC, PSNR, and MI indicators, while the remaining indicators are at a middle level, giving excellent comprehensive performance.
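For a concrete picture of the two fusion rules used in the fusion layer, the sketch below shows one plausible reading of them on PyTorch feature tensors of shape (batch, channels, height, width). The function names, the proportional weight normalization, and the use of global average pooling are assumptions for illustration, not the authors' code.

```python
# Illustrative sketch of the two fusion strategies described in the abstract.
# Inputs are encoder feature maps of shape (B, C, H, W) from the infrared and
# visible branches; outputs are fused feature maps of the same shape.
import torch
import torch.nn.functional as F

def spatial_l1_fusion(feat_ir: torch.Tensor, feat_vis: torch.Tensor) -> torch.Tensor:
    """Fuse two feature maps with spatial weights from the channel-wise L1 norm."""
    eps = 1e-8
    # Activity-level measure: L1 norm across channels at every spatial position.
    act_ir = feat_ir.abs().sum(dim=1, keepdim=True)    # (B, 1, H, W)
    act_vis = feat_vis.abs().sum(dim=1, keepdim=True)
    # Normalised spatial weight maps.
    w_ir = act_ir / (act_ir + act_vis + eps)
    w_vis = act_vis / (act_ir + act_vis + eps)
    return w_ir * feat_ir + w_vis * feat_vis

def channel_attention_fusion(feat_ir: torch.Tensor, feat_vis: torch.Tensor) -> torch.Tensor:
    """Fuse two feature maps with per-channel weights from global average pooling."""
    eps = 1e-8
    # Global pooling gives one scalar per channel as its importance measure.
    g_ir = F.adaptive_avg_pool2d(feat_ir, 1)            # (B, C, 1, 1)
    g_vis = F.adaptive_avg_pool2d(feat_vis, 1)
    w_ir = g_ir / (g_ir + g_vis + eps)
    w_vis = g_vis / (g_ir + g_vis + eps)
    return w_ir * feat_ir + w_vis * feat_vis
```

Per the abstract, the spatial L1-norm rule is applied to the cascaded residual features and the channel-attention rule to the cascaded dense features, and the two fused maps are then passed to the decoder.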
Authors: FENG Xin; YANG Jieming; ZHANG Hongde; QIU Guohang (School of Mechanical Engineering, Key Laboratory of Manufacturing Equipment Mechanism Design and Control of Chongqing, Chongqing Technology and Business University, Chongqing 400067, China)
Source: Acta Photonica Sinica (光子学报), EI / CAS / CSCD / Peking University Core Journal, 2023, Issue 11, pp. 278-289 (12 pages)
Funding: National Natural Science Foundation of China (No. 22178036); Innovation Research Group Project of Chongqing Universities (No. CXQT21024); Natural Science Foundation of Chongqing (No. CSTB2022NSCQ-MSX0271).
Keywords: Infrared and visible image fusion; Dual channel parallel network; Residual dense module; Attention model; Auto-encoder network