Remote Sensing Image Fusion Method Based on Improved Swin Transformer
Abstract: To address the problem that existing transformer-based methods fail to fully fuse the multi-scale features of remote sensing images, a multispectral-panchromatic fusion network is proposed. A multi-scale window self-attention module based on an improved Swin transformer is embedded in the fusion network; it attends to global spatial features while fully fusing feature information at different scales, so as to preserve spectral and spatial-structure information to the greatest extent. Through skip connections between features at different levels, the decoding network predicts the texture components missing from the original multispectral image, and a detail-injection model then recovers the target image. To improve the fusion result, a spectral loss and a spatial-structure loss are added to the loss function. Compared with other methods, the proposed method achieves the best subjective visual quality on the WorldView-4, QuickBird and WorldView-2 satellite datasets; relative to the second-best method, its relative global error (ERGAS) is reduced by 11.99%, 0.4% and 3.43% on the three datasets, respectively.

Extended abstract: Remote sensing images are widely used in land monitoring, environmental perception, disaster prediction and urban analysis. Most commercial satellites, such as WorldView-4, QuickBird and WorldView-2, carry sensors that acquire panchromatic and multispectral images at the same time. Panchromatic images have high spatial resolution but only a single band, while multispectral images have low spatial resolution because of the bandwidth limitation of the equipment. To obtain more accurate details of the observed scene, a panchromatic image and a multispectral image can be fused to generate an image with both high spatial resolution and high spectral resolution. Fusion methods for multispectral and panchromatic images fall into four categories: multi-resolution analysis, component substitution, variational optimization and deep learning. Compared with traditional methods, deep learning has stronger feature extraction ability and is therefore widely used, and the transformer structure has recently been introduced into advanced remote sensing image fusion methods. To address the problem that existing transformer-based methods fail to fully integrate multi-scale features of remote sensing images, this paper proposes a multispectral-panchromatic fusion network, MSCANet, based on an improved Swin transformer. The model extracts features from the multispectral and panchromatic images with two parallel branches; the downsampled feature maps are concatenated and fed into the fusion network. To improve the robustness of feature extraction in complex ground scenes, a Multiscale Swin-transformer with Channel Attention (MSCA) unit is integrated into the fusion part. The unit replaces the MLP part of the Swin transformer with a cascade of multi-scale convolution and channel attention, which better fuses feature information from ground objects of different sizes and exploits the long-range dependence between regions. The fusion network focuses on predicting the high-frequency details lost in the multispectral image; these details are then added to the original image to restore a high-resolution multispectral image. Simulated and real experiments are conducted on data from the three commercial satellites. In the simulated-data experiments, the fusion results are evaluated by computing the difference between the fused image and the reference image. Compared with other methods, MSCANet performs best in both visual quality and quantitative metrics; relative to the second-best method, its ERGAS on the three datasets decreases by 11.99%, 0.4% and 3.43%, respectively. In the experiments on the three real datasets, combining visual inspection with quantitative analysis, MSCANet again gives the best results. Ablation experiments are conducted on the three fusion strategies proposed in this paper. The results show that the detail-injection model used here outperforms a non-injection model, and that replacing the MLP module in the MSCA unit and adding the attention mechanism both contribute to better fusion performance. Adding the spectral loss and the spatial-structure loss on top of the MAE loss is also effective in improving spectral fidelity and spatial resolution. In conclusion, the effectiveness of the proposed method is verified by comparison and ablation experiments. In future work, MSCANet is expected to be transferred to the fusion of multispectral and hyperspectral images, visible and infrared images, and other similar tasks, to improve the generalization of the proposed model.
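
The extended abstract describes the MSCA unit as a Swin transformer block whose MLP is replaced by a cascade of multi-scale convolution and channel attention. The PyTorch sketch below is only an illustration of that idea under stated assumptions: the unshifted window partition, the use of nn.MultiheadAttention in place of a full shifted-window attention, the 1/3/5 branch kernels, the SE-style channel attention, and the names MSCABlock and ChannelAttention are hypothetical choices, not the authors' implementation.

import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    # SE-style channel attention; the reduction ratio is an illustrative choice.
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)


class MSCABlock(nn.Module):
    # Window self-attention followed by a multi-scale convolution +
    # channel-attention cascade in place of the usual Swin MLP.
    def __init__(self, channels, window=8, heads=4):
        super().__init__()
        self.window = window
        self.norm1 = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(channels)
        self.branches = nn.ModuleList(
            [nn.Conv2d(channels, channels, k, padding=k // 2) for k in (1, 3, 5)]
        )
        self.fuse = nn.Conv2d(3 * channels, channels, 1)
        self.ca = ChannelAttention(channels)

    def forward(self, x):  # x: (B, C, H, W), H and W divisible by the window size
        b, c, h, w = x.shape
        s = self.window
        nh, nw = h // s, w // s
        # Partition into non-overlapping windows and apply self-attention per window.
        win = x.reshape(b, c, nh, s, nw, s).permute(0, 2, 4, 3, 5, 1)
        win = win.reshape(b * nh * nw, s * s, c)
        q = self.norm1(win)
        att, _ = self.attn(q, q, q)
        win = self.norm2(win + att)
        # Merge windows back into a feature map.
        y = win.reshape(b, nh, nw, s, s, c).permute(0, 5, 1, 3, 2, 4).reshape(b, c, h, w)
        # MLP replacement: multi-scale convolutions fused by a 1x1 conv, then channel attention.
        y = y + self.ca(self.fuse(torch.cat([br(y) for br in self.branches], dim=1)))
        return y


# Shape check: MSCABlock(64)(torch.randn(1, 64, 64, 64)).shape -> torch.Size([1, 64, 64, 64])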
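
The abstract also states that the fusion network predicts the high-frequency details missing from the multispectral image, injects them into the original image, and is trained with an MAE loss augmented by spectral and spatial-structure terms, with ERGAS used for evaluation. The sketch below assumes bicubic upsampling, a SAM-style spectral term, a gradient-based structure term, illustrative loss weights and a hypothetical fusion_net signature; only the ERGAS function follows the standard definition, written here for a 4:1 panchromatic/multispectral resolution ratio.

import torch
import torch.nn.functional as F


def inject_details(ms_lr, pan, fusion_net, scale=4):
    # Upsample the low-resolution MS image and add the network-predicted
    # high-frequency details (detail-injection model).
    ms_up = F.interpolate(ms_lr, scale_factor=scale, mode='bicubic', align_corners=False)
    return ms_up + fusion_net(ms_up, pan)  # fusion_net signature is assumed


def composite_loss(pred, target, w_spec=0.1, w_struct=0.1):
    # MAE term plus assumed spectral (SAM-like) and spatial-structure (gradient) terms.
    mae = F.l1_loss(pred, target)
    cos = F.cosine_similarity(pred, target, dim=1).clamp(-1 + 1e-7, 1 - 1e-7)
    spectral = torch.acos(cos).mean()

    def dx(t):
        return t[..., :, 1:] - t[..., :, :-1]

    def dy(t):
        return t[..., 1:, :] - t[..., :-1, :]

    structure = F.l1_loss(dx(pred), dx(target)) + F.l1_loss(dy(pred), dy(target))
    return mae + w_spec * spectral + w_struct * structure


def ergas(pred, target, ratio=4):
    # Relative global dimensionless error; lower is better.
    # ERGAS = (100 / ratio) * sqrt(mean_k (RMSE_k / mean_k(target))^2) over spectral bands k.
    rmse = torch.sqrt(((pred - target) ** 2).flatten(2).mean(dim=2))  # (B, bands)
    band_mean = target.flatten(2).mean(dim=2)                         # (B, bands)
    return (100.0 / ratio) * torch.sqrt(((rmse / band_mean) ** 2).mean(dim=1))

In the simulated-data protocol described above, ergas would be computed between the fused output and the reference image; the reported 11.99%, 0.4% and 3.43% figures are reductions of this quantity relative to the second-best method.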
Authors: LI Zitong (李紫桐); ZHAO Jiankang (赵健康); XU Jingran (徐静冉); LONG Haihui (龙海辉); LIU Chuanqi (刘传奇) (School of Electronic Information and Electrical Engineering, School of Perceptual Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China)
Source: Acta Photonica Sinica (光子学报), 2023, No. 11, pp. 248-262 (15 pages); indexed in EI, CAS, CSCD and the Peking University Core Journal list (北大核心)
Funding: National Natural Science Foundation of China (No. 62171283); Shanghai Commercial Aircraft System Engineering Joint Research Fund (No. CASEF-2022-MQ01)
Keywords: Remote sensing; Image fusion; Multispectral image; Panchromatic image; Swin transformer