Transformer-based multiscale remote sensing semantic segmentation network

Cited by: 1
Abstract: To improve semantic segmentation of remote sensing images, whose target classes show small inter-class variance and large intra-class variance, this paper proposes a Transformer-based multi-scale semantic segmentation network (multi-scale Transformer network, MSTNet) built around two key points: global contextual information and multi-scale semantic features. MSTNet consists of an encoder and a decoder. The encoder comprises a visual attention network (VAN) backbone improved on the basis of the Transformer and a multi-scale semantic feature extraction module (MSFEM) improved from the atrous spatial pyramid pooling (ASPP) structure. The decoder is a lightweight multi-layer perceptron (MLP) designed to match the encoder; it fully analyzes the semantic features that the encoder extracts by exploiting the inductive properties of the Transformer, features that carry both global contextual information and multi-scale representations. MSTNet was validated on two high-resolution remote sensing semantic segmentation datasets, ISPRS Potsdam and LoveDA, reaching a mean intersection over union (mIoU) of 79.50% and 54.12% and a mean F1-score (mF1) of 87.46% and 69.34%, respectively. The experimental results indicate that the proposed method effectively improves semantic segmentation of remote sensing images.
Authors: SHAO Kai (邵凯); WANG Mingzheng (王明政); WANG Guangyu (王光宇) (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; Chongqing Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; Engineering Research Center of Mobile Communications of the Ministry of Education, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source: CAAI Transactions on Intelligent Systems (《智能系统学报》); indexed in CSCD and the Peking University Core Journals list; 2024, No. 4, pp. 920-929 (10 pages)
Keywords: remote sensing image; semantic segmentation; convolutional neural network; Transformer; global contextual information; multiscale receptive field; encoder; decoder
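Note on the reported metrics: mIoU and mF1 are per-class scores averaged over all classes. The following is a minimal sketch of how such values are commonly derived from a confusion matrix. It is not the authors' evaluation code; the function name `miou_and_mf1`, the row/column layout, and the toy matrix are assumptions made for illustration only.

```python
import numpy as np

def miou_and_mf1(conf):
    """Return (mIoU, mF1) from a square confusion matrix.

    Assumed layout (not specified by the paper): conf[i, j] counts
    pixels whose true class is i and whose predicted class is j.
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)              # correctly labelled pixels per class
    fp = conf.sum(axis=0) - tp      # predicted as class c, truly another class
    fn = conf.sum(axis=1) - tp      # truly class c, predicted as another class
    iou_den = tp + fp + fn
    f1_den = 2 * tp + fp + fn
    # A class absent from both prediction and ground truth scores 0 here;
    # other conventions (e.g. skipping such classes) are equally plausible.
    iou = np.divide(tp, iou_den, out=np.zeros_like(tp), where=iou_den > 0)
    f1 = np.divide(2 * tp, f1_den, out=np.zeros_like(tp), where=f1_den > 0)
    return iou.mean(), f1.mean()

# Hypothetical two-class example, unrelated to the paper's datasets.
toy = [[8, 2],
       [1, 9]]
miou, mf1 = miou_and_mf1(toy)
```

Per class, F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU, which is consistent with the reported mF1 values exceeding the corresponding mIoU values.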