Abstract
Existing semantic segmentation networks suffer from slow inference, missing contour information, and insufficient semantic (context-dependent) information, which makes them poorly suited to the semantic segmentation of aerial images. This paper proposes a decoding network that combines a cross-attention hybrid mechanism with a pyramid attention mechanism for aerial image semantic segmentation. First, MobileNetV2 is adopted as the backbone network to reduce computation and parameter count and thereby improve inference speed. Second, a cross-attention hybrid mechanism is proposed to recover the missing contour information. Third, a pyramid attention mechanism is proposed to overcome the inability of convolutional neural networks to capture long-range semantic information. Experiments on the DLRSD (Dense Labeling Remote Sensing Dataset) dataset show that the network achieves a mean intersection over union (mIoU) of 73.4% and a pixel accuracy of 85.4%. With 256×256×3 input on a single GTX 3090 card, it reaches an inference speed of 196.9 frames per second.
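The two accuracy figures quoted in the abstract, mean intersection over union and pixel accuracy, are usually derived from a class confusion matrix. The following is a minimal sketch of those standard definitions in plain Python. The record does not describe the paper's exact evaluation protocol, so the function names (`confusion_matrix`, `pixel_accuracy`, `mean_iou`) and the choice to skip classes that appear in neither the labels nor the predictions are assumptions made for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class matches the label."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total if total else 0.0


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # labelled c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, labelled otherwise
        denom = tp + fp + fn
        if denom:  # assumption: ignore classes absent from labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six flattened pixels, three classes (illustrative data only).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(labels, preds, num_classes=3)
print(pixel_accuracy(cm), mean_iou(cm))
```

For a real segmentation map, the label and prediction arrays would be flattened to one-dimensional sequences of class indices before being passed in, and the matrix would be accumulated over the whole test set before the metrics are computed.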
Authors
YUAN Xuliang (袁旭亮)
WANG Juan (王娟)
WU Minghu (武明虎)
GUO Liquan (郭力权)
LIU Zieshan (刘子杉)
CHEN Guanhai (陈关海)
Affiliations: College of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan 430065, China; Xiangyang Industrial Research Institute of Hubei University of Technology, Xiangyang, Hubei 441100, China
Source
Laser Journal (《激光杂志》)
Indexed in: CAS; Peking University Core Journals (北大核心)
2023, Issue 1, pp. 122-129 (8 pages)
Funding
Supported by the National Natural Science Foundation of China (No. 62006073).
Keywords
semantic segmentation of aerial images
real-time semantic segmentation
pyramid attention mechanism
cross-attention hybrid mechanism