A comprehensive survey and assumption of remote sensing foundation models (遥感基础模型发展综述与未来设想)


Abstract: In recent years, intelligent interpretation of remote sensing data has advanced rapidly, but most established models are task-specific. They are difficult to generalize across tasks, which wastes considerable resources. Foundation models offer a general-purpose, generalizable solution and have recently attracted considerable interest in the remote sensing community. Although many works have achieved remarkable results on some perception-recognition and cognitive-prediction tasks using single-temporal or multitemporal remote sensing data, a comprehensive review that gives a systematic overview of remote sensing foundation models is still lacking. This paper therefore first summarizes research progress on existing remote sensing foundation models from the perspectives of data, methods, and applications. Then, based on an analysis of current limitations, it proposes a concept for a new-generation general predictive foundation model for remote sensing. Finally, it discusses, and reports experiments on, the directions most urgently in need of research, linking past achievements with the future possibilities of remote sensing foundation models.

Existing remote sensing foundation models are categorized into three groups according to the data type used (single-temporal or multitemporal) and the task involved (perceptual recognition or cognitive prediction): perceptual recognition models based on single-temporal data, perceptual recognition models based on multitemporal data, and cognitive prediction models based on multitemporal data. Each group is subdivided further. By self-supervised learning method, single-temporal perceptual recognition models are divided into contrastive-learning and generative-learning approaches. By number of tasks, multitemporal perceptual recognition models are divided into single-task-oriented and multitask-oriented models. By architecture, multitemporal cognitive prediction models are divided into transformer-based and graph-network-based models. Following this categorization, we describe the current state of each type and summarize its limitations in terms of data, methods, and applications.

Building on this summary and analysis, we propose a concept for a novel general predictive foundation model. By extracting stable, generalized time-series hyper-pixel features, the model is intended to open an information pipeline from multidomain, multitemporal data inputs to task outputs at multiple temporal and spatial scales, enabling accurate cognitive prediction of future states. On the data side, it incorporates tens of millions of multiplatform, multitype, multimodal, and multitemporal data. On the method side, a new architecture combines the strengths of the transformer and the graph network, increasing model capacity and improving generalization while predicting multitarget interactions in large remote sensing scenes over long horizons. On the application side, the model can be applied to diverse cognitive prediction tasks across multiple spatial and temporal scales. Under this concept, we identify four exploratory directions, namely multidomain time-series data representation, stable feature extraction, object-environment interaction modeling, and multitask interaction reasoning, as a reference for researchers exploring remote sensing foundation models.

In summary, foundation models with generalization ability are crucial to the development of intelligent remote sensing interpretation. By collating the current state of research, we provide an overview of recent advances in the field. By analyzing the limitations of current models in data, methods, and applications, we propose the general predictive foundation model concept and clarify the four directions that most urgently need breakthroughs. Follow-up work will pursue specific technical breakthroughs in these four directions, toward a general remote sensing foundation model that integrates perception recognition and cognitive prediction in a single architecture.

Authors: FU Kun (付琨); LU Wanxuan (卢宛萱); LIU Xiaoyu (刘小煜); DENG Chubo (邓楚博); YU Hongfeng (于泓峰); SUN Xian (孙显)

Affiliations: Key Laboratory of Network Information System Technology (NIST), Chinese Academy of Sciences, Beijing 100190, China; Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; University of Chinese Academy of Sciences, Beijing 100101, China

Source: National Remote Sensing Bulletin (《遥感学报》), 2024, No. 7, pp. 1667-1680 (14 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

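Funding: National Natural Science Foundation of China (Nos. 62201550, 62171436); Key Deployment Research Program of the Chinese Academy of Sciences (No. KGFZD-145-23-18); Science and Technology Innovation 2030 Major Project on New Generation Artificial Intelligence (No. 2022ZD0118401).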

Keywords: remote sensing intelligent interpretation; remote sensing foundation models; general prediction; multitemporal data; multitask

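Classification codes: TP701 [Automation and Computer Technology: Detection Technology and Automatic Equipment]; P2 [Astronomy and Earth Sciences: Surveying and Mapping Science and Technology]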

