Abstract
With the rapid development of deep learning and computer vision technologies, fine-tuning pretrained models for remote sensing scene classification often requires substantial computational resources. To reduce memory requirements and training costs, this paper proposes a fine-tuning method for remote sensing models called the Multi-Fusion Adapter (MuFA). MuFA introduces a multi-scale fusion module that combines bottleneck modules with different down-sampling rates and connects them in parallel with the original vision Transformer model. During training, the parameters of the original vision Transformer are frozen; only the MuFA modules and the classification head are fine-tuned. Experimental results show that MuFA achieves superior performance on the UCM and NWPU-RESISC45 remote sensing scene classification datasets, surpassing other parameter-efficient fine-tuning methods. MuFA therefore maintains model performance while reducing resource overhead, making it promising for a wide range of remote sensing applications.
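The abstract gives no implementation details beyond this description. As a rough illustration only, a parallel adapter branch that fuses bottlenecks with different down-sampling rates could be sketched in NumPy as below; every name, width, and the zero-initialised up-projection are assumptions for the sketch, not details from the paper:

```python
import numpy as np

def bottleneck(x, w_down, w_up):
    # Standard adapter bottleneck: down-project, ReLU, up-project.
    return np.maximum(x @ w_down, 0.0) @ w_up

def multi_scale_adapter(x, scales=(4, 8), seed=0):
    """Fuse bottlenecks with different down-sampling rates (hypothetical
    widths d/4 and d/8) by summing them into one residual update."""
    rng = np.random.default_rng(seed)
    d = x.shape[-1]
    out = np.zeros_like(x)
    for r in scales:
        h = d // r                            # hidden width for this rate
        w_down = rng.normal(0.0, 0.02, (d, h))
        w_up = np.zeros((h, d))               # zero-init: branch starts as identity
        out += bottleneck(x, w_down, w_up)
    return out

# Features from the frozen Transformer path (batch of 2, 16 tokens, dim 768);
# the adapter branch is added in parallel, as the abstract describes.
x = np.ones((2, 16, 768))
y = x + multi_scale_adapter(x)
```

In an actual fine-tuning setup, only the adapter weights (`w_down`, `w_up`) and the classification head would receive gradients, while the backbone stays frozen; with the zero-initialised up-projection the branch initially leaves the frozen features unchanged.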
Authors
YIN Wenxin; YU Haichen; DIAO Wenhui; SUN Xian; FU Kun (Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China; School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China; Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China)
Source
Journal of Electronics & Information Technology (《电子与信息学报》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals (北大核心)
2024, No. 9, pp. 3731-3738 (8 pages)
Funding
National Key R&D Program of China (2022ZD0118401).
Keywords
Remote sensing
Scene classification
Parameter efficient
Deep learning