Abstract
Convolutional neural networks (CNNs) have made great progress in image segmentation over the past few years, but the limitations of the convolution operation prevent them from handling global and long-range dependencies well. To address this, an image segmentation model based on Swin-Unet is proposed. Transformer Block modules are introduced into the encoder and decoder stages of the U-Net network model, and the Dice loss function (Dice_loss), which is better suited to binary classification, is used for feature extraction and learning. Experiments were carried out on the Inria Aerial Image Labeling dataset, which is intended for research on urban buildings in remote sensing satellite images. The results show that the Swin-Unet model extracts more semantic information from remote sensing satellite images and thereby achieves better recognition, with an IoU score of 0.70.
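For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of a soft Dice loss and an IoU score for binary masks. The abstract does not give the paper's exact formulation, so this uses one common definition; the function names `dice_loss` and `iou_score`, the smoothing term `eps`, and the 0.5 threshold are assumptions made for illustration only.

```python
def dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss over flattened pixels (one common definition, assumed here).

    probs  : predicted foreground probabilities in [0, 1]
    target : ground-truth labels, 1 for building and 0 for background
    eps    : small smoothing term (assumption) to avoid division by zero
    """
    intersection = sum(p * t for p, t in zip(probs, target))
    total = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


def iou_score(probs, target, threshold=0.5):
    """Intersection over union after thresholding the probabilities."""
    pred = [1 if p >= threshold else 0 for p in probs]
    intersection = sum(p & t for p, t in zip(pred, target))
    union = sum(p | t for p, t in zip(pred, target))
    # Two empty masks are treated as a perfect match (a convention, assumed).
    return intersection / union if union else 1.0


if __name__ == "__main__":
    probs = [0.9, 0.6, 0.7, 0.1]   # toy predictions for four pixels
    target = [1, 1, 0, 0]          # toy ground truth
    print(dice_loss(probs, target), iou_score(probs, target))
```

Both measures reward overlap between the predicted and true foreground regions rather than per-pixel accuracy, which is why they are often preferred when the foreground class (here, buildings) covers only a small share of the image.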
Authors
王俊博
孙皓月
刘晓
WANG Junbo; SUN Haoyue; LIU Xiao (Hebei University of Architecture, Zhangjiakou, Hebei 075000)
Source
《河北建筑工程学院学报》
CAS
2024, No. 3, pp. 247-252 (6 pages)
Journal of Hebei Institute of Architecture and Civil Engineering