Abstract
Accurate and generalizable aircraft detection in remote sensing images must cope with complex backgrounds, large scale variation, densely packed targets, arbitrary aircraft orientation, and indistinct features. At present, however, training data are limited, training from scratch consumes substantial computing power and time, and models overfit easily, so both the model structure and the training process need to be optimized. To address these problems, a transfer learning strategy is first introduced: before the Faster-RCNN model is trained, weights pre-trained on the MS COCO dataset are loaded so that the model converges quickly, which saves a large amount of training time. The VGG16 feature extraction network of the original Faster-RCNN is then replaced with ResNet50 to make better use of deep semantic information. On this basis, an FPN is added and the number of anchor boxes is increased from the 9 of the original Faster-RCNN to 15; fusing multi-scale feature maps yields richer feature representations and thereby improves the network's ability to detect and localize targets. Aircraft detection experiments are conducted on the RSOD-Dataset, where the performance of different detection algorithms is also compared, and the NWPU VHR-10 dataset is then used to verify the generalization and stability of the model. The experimental results show that the improved Faster-RCNN achieves a precision of 97.54% on the RSOD-Dataset and 98.27% on the NWPU VHR-10 dataset. Combining transfer learning with the improved Faster-RCNN network structure enables high-precision target detection with a small amount of data and gives strong generalization ability. The proposed method can be applied to other target detection and recognition tasks and therefore has good potential for wider use.
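The change from 9 to 15 anchor boxes can be illustrated with a minimal sketch. The original Faster-RCNN pairs 3 scales (128, 256, 512) with 3 aspect ratios (1:2, 1:1, 2:1). The abstract does not say how the 15 anchors are composed, so the 5-scale set below is an assumption chosen only to reach that count, and `make_anchors` is a hypothetical helper, not the authors' code.

```python
import math


def make_anchors(scales, ratios):
    """Return (width, height) anchor shapes, one per scale/ratio pair.

    ratio is height / width; every anchor keeps an area of scale ** 2.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((w, h))
    return anchors


# Original Faster-RCNN setting: 3 scales x 3 ratios = 9 anchors.
BASE_SCALES = (128, 256, 512)
# Assumed extension (not stated in the abstract): 5 scales x 3 ratios = 15.
EXTENDED_SCALES = (32, 64, 128, 256, 512)
RATIOS = (0.5, 1.0, 2.0)

base_anchors = make_anchors(BASE_SCALES, RATIOS)
extended_anchors = make_anchors(EXTENDED_SCALES, RATIOS)
print(len(base_anchors), len(extended_anchors))
```

Under this assumption the added small scales (32 and 64) are what would help with small, densely packed aircraft. The paper itself may have used a different combination of scales and ratios.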
Authors
周绍鸿
方新建
刘鑫怡
张潆丹
严盛
Zhou Shaohong; Fang Xinjian; Liu Xinyi; Zhang Yingdan; Yan Sheng (School of Spatial Information and Surveying and Mapping Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China; The 21st Century Space Technology Application Co., Ltd., Beijing 100096, China)
Source
《机电工程技术》
2024, No. 5, pp. 172-177 (6 pages)
Mechanical & Electrical Engineering Technology
Funding
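Open Fund of the Anhui Province Engineering Laboratory for Big Data Analysis and Early Warning Technology of Coal Mine Safety (CSBD2022-2D04).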