Abstract
In recent years, significant advances in large language models and visual foundation models have drawn scholars' attention to general-purpose artificial intelligence technology for remote sensing, and have driven a new paradigm in research on large models for remote sensing information processing. Large remote sensing models, also known as pre-trained remote sensing foundation models, are large-scale deep learning models trained on large amounts of unlabeled remote sensing images. The goal is to extract universal feature representations from remote sensing images, thereby improving the performance, efficiency, and versatility of remote sensing image analysis tasks. Research on large remote sensing models involves three key factors: the pre-training dataset, the number of model parameters, and the pre-training technique. Of these, the pre-training dataset and the number of model parameters can be scaled up flexibly as data and computational resources grow, whereas the pre-training technique is the key factor in improving model performance. Taking pre-training techniques as its main thread, this review summarizes and analyzes existing supervised single-modal, unsupervised single-modal, and vision-text joint multimodal pre-trained large remote sensing models. Finally, it offers an outlook on large remote sensing models in four respects: integrating remote sensing domain knowledge and physical constraints, improving data generalization, expanding application scenarios, and reducing data costs.
Authors
ZHANG Liangpei (张良培); ZHANG Lefei (张乐飞); YUAN Qiangqiang (袁强强)
Affiliations: State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; School of Computer Science, Wuhan University, Wuhan 430072, China; School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
Source
Geomatics and Information Science of Wuhan University (《武汉大学学报(信息科学版)》)
Indexed in: EI, CAS, CSCD, PKU Core Journals (北大核心)
2023, No. 10, pp. 1574-1581 (8 pages)
Funding
National Key Research and Development Program of China (2022YFB3903405).
Keywords
large remote sensing model
pre-trained foundation model
multi-modal foundation model