Semantic Segmentation of Dual-Source Remote Sensing Images Based on Gated Attention and Multiscale Residual Fusion
Abstract: Semantic segmentation is a key step in geographic-object-based analysis of remote sensing images. Remote sensing imagery and elevation data provide complementary features, and combining them improves pixel-level segmentation accuracy. This study proposes STAM-SegNet, a dual-source remote sensing image semantic segmentation model that uses a Swin Transformer backbone to extract multiscale features and integrates an adaptive gated attention mechanism with a multiscale residual fusion strategy. The adaptive gated attention mechanism comprises gated channel attention and gated spatial attention. Gated channel attention strengthens the correlation between the features of the two data sources through a competition/cooperation mechanism, effectively extracting their complementary features. Gated spatial attention uses spatial contextual information to dynamically filter out part of the high-level semantic features and select accurate detail features. The multiscale residual fusion strategy captures multiscale contextual information through multiscale refinement and a residual structure, strengthening attention to detail features such as shadows and boundaries while also improving training speed. In experiments on the Vaihingen and Potsdam datasets, the proposed method achieves average F1-scores of 89.66% and 92.75%, respectively, a higher segmentation accuracy than networks such as DeepLabV3+, UperNet, DANet, TransUNet, and Swin-UNet.
Authors: Guo Wen (郭文); Yang Hong (杨虹); Liu Chang (刘畅) (School of Science, Beijing Information Science and Technology University, Beijing 100029, China; Institute of Applied Mathematics, Beijing Information Science and Technology University, Beijing 100101, China)
Source: Laser & Optoelectronics Progress (《激光与光电子学进展》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 18, pp. 450-460 (11 pages)
Funding: National Natural Science Foundation of China (62171044); Beijing Natural Science Foundation (4222104).
Keywords: remote sensing image interpretation; semantic segmentation; dual-source remote sensing data; adaptive gating; attention mechanism; multiscale; residual fusion
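
The abstract reports accuracy as an average F1-score but does not say how the average is taken. As a minimal sketch, assuming the figure is an unweighted mean of per-class F1 values computed from a pixel-level confusion matrix, the calculation could look like the following. The function name `mean_f1` and the toy matrix are illustrative assumptions and are not taken from the paper.

```python
def mean_f1(confusion):
    """Return (per-class F1 list, unweighted mean F1) for a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    Assumption: classes are averaged without weighting; the paper may differ.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]                                  # correctly labelled pixels of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp      # predicted k, truly another class
        fn = sum(confusion[k]) - tp                           # truly k, predicted another class
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return scores, sum(scores) / n


# Toy 3-class confusion matrix (made-up numbers, not results from the paper).
toy = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
per_class, avg = mean_f1(toy)
print([round(s, 4) for s in per_class], round(avg, 4))
```

An unweighted mean treats every class equally, so small classes count as much as large ones; a pixel-weighted average would generally give a different number.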