Abstract
The semantic segmentation of remote sensing images is a crucial step in geographic-object-based remote sensing image analysis. Combining remote sensing image data with elevation data yields effective feature complementarity, thereby improving pixel-level segmentation accuracy. This study proposes a dual-source remote sensing image semantic segmentation model, STAM-SegNet, which uses a Swin Transformer backbone to extract multiscale features and integrates an adaptive gated attention mechanism with a multiscale residual fusion strategy. The adaptive gated attention mechanism comprises gated channel attention and gated spatial attention. Gated channel attention enhances the correlation between dual-source data features through a competition/cooperation mechanism, effectively extracting the complementary features of the two sources. Gated spatial attention uses spatial contextual information to dynamically filter out part of the high-level semantic features and select accurate detail features. The multiscale feature residual fusion strategy captures multiscale contextual information via multiscale refinement and a residual structure, strengthening attention to detail features such as shadows and boundaries while improving the model's training speed. Experiments on the Vaihingen and Potsdam datasets show that the proposed model achieves average F1-scores of 89.66% and 92.75%, respectively, surpassing networks such as DeepLabV3+, UperNet, DANet, TransUNet, and Swin-UNet in segmentation accuracy.
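The abstract does not give the exact formulation of the gated channel attention, so the following is only a rough illustrative sketch of the general idea it describes: two feature maps (image branch and elevation branch) produce per-channel gates from pooled statistics, and a softmax over the two sources makes the channels "compete/cooperate" for weight in the fused output. All function names, shapes, and the softmax choice are assumptions, not the authors' implementation.

```python
import numpy as np

def gated_channel_fusion(rgb_feat, dsm_feat):
    """Illustrative gated channel-attention fusion of dual-source features.

    Inputs are (C, H, W) feature maps from the image branch and the
    elevation (DSM) branch. A softmax over the two sources, computed per
    channel from globally pooled descriptors, yields gates that sum to 1,
    so the sources compete/cooperate channel by channel.
    """
    # Global average pooling -> per-channel descriptors, shape (C,)
    rgb_desc = rgb_feat.mean(axis=(1, 2))
    dsm_desc = dsm_feat.mean(axis=(1, 2))
    # Stack descriptors and apply a softmax across the two sources:
    # per channel, the gates form a convex combination (sum to 1).
    logits = np.stack([rgb_desc, dsm_desc])               # shape (2, C)
    gates = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
    # Broadcast the per-channel gates over the spatial dimensions.
    fused = (gates[0][:, None, None] * rgb_feat
             + gates[1][:, None, None] * dsm_feat)
    return fused, gates

rgb = np.random.rand(4, 8, 8)   # hypothetical image-branch features
dsm = np.random.rand(4, 8, 8)   # hypothetical elevation-branch features
fused, gates = gated_channel_fusion(rgb, dsm)
print(fused.shape)                           # (4, 8, 8)
print(np.allclose(gates.sum(axis=0), 1.0))   # True: gates form a convex combination
```

In the actual model the gates would be produced by learned layers rather than raw pooled statistics; this sketch only shows how channel-wise gating lets the two data sources trade off influence per channel.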
Authors
Guo Wen; Yang Hong; Liu Chang (School of Science, Beijing Information Science and Technology University, Beijing 100029, China; Institute of Applied Mathematics, Beijing Information Science and Technology University, Beijing 100101, China)
Source
《激光与光电子学进展》 (Laser & Optoelectronics Progress)
CSCD; Peking University Core Journal
2024, No. 18, pp. 450-460 (11 pages)
Funding
National Natural Science Foundation of China (62171044)
Beijing Natural Science Foundation (4222104)
Keywords
remote sensing image interpretation
semantic segmentation
dual-source remote sensing data
adaptive gating
attention mechanism
multiscale
residual fusion