Spark-based Remote Sensing Data Analysis (基于Spark的遥感数据分析方法)

Cited by: 1

Abstract: With the rapid development of remote sensing techniques, the volume of acquired data is growing explosively, which poses a major challenge for remote sensing data computation. To address this problem, the paper adopts Spark, an in-memory distributed computing framework, with YARN as the resource scheduler and HDFS as the distributed storage system. Spark is an open-source distributed computing framework built on the resilient distributed dataset (RDD) abstraction. It uses an advanced directed acyclic graph (DAG) execution engine to support cyclic data flow, so data loaded into memory once can be reused across many iterations. This makes Spark particularly well suited to big-data analysis methods that rely on repeated iteration, and gives it a clear advantage over MapReduce, which has to load the data into memory in every iteration. The framework is applied to the analysis of massive remote sensing data to verify the effectiveness of the singular value decomposition (SVD) algorithm, which requires many iterations, in this setting. The experiments show that as the number of iterations increases, Spark-based SVD becomes markedly more efficient than the MapReduce version, usually by one order of magnitude.
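The point the abstract makes about iteration can be illustrated with a small sketch. The record does not say which SVD algorithm the paper uses, so the code below swaps in plain power iteration on A^T A as a stand-in, written in single-machine Python with no Spark dependency. The function names `load_rows` and `top_singular_value` and the toy matrix are hypothetical. What the sketch shows is the access pattern: the rows are loaded once and every iteration makes another pass over the same in-memory data.

```python
import math


def load_rows():
    """Stand-in for reading a matrix from distributed storage (hypothetical toy data)."""
    return [[3.0, 0.0], [0.0, 2.0], [0.0, 0.0]]


def top_singular_value(rows, iterations=50):
    """Estimate the largest singular value of A by power iteration on A^T A.

    Assumption: this is an illustrative iterative method, not necessarily
    the SVD algorithm used in the paper.
    """
    n = len(rows[0])
    v = [1.0 / math.sqrt(n)] * n  # unit-length starting vector
    sigma = 0.0
    for _ in range(iterations):
        # One pass over the in-memory rows: w = A^T (A v), accumulated row by row.
        w = [0.0] * n
        for row in rows:
            dot = sum(a * x for a, x in zip(row, v))
            for j, a in enumerate(row):
                w[j] += a * dot
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        # For unit v near the top right singular vector, ||A^T A v|| approaches sigma^2.
        sigma = math.sqrt(norm)
    return sigma


rows = load_rows()  # loaded once, reused by every iteration
print(top_singular_value(rows))
```

In a Spark setting the same pattern would presumably keep `rows` as a cached RDD (for example via `RDD.cache()`) and express the per-row accumulation as a distributed map and reduce, whereas a chain of MapReduce jobs would reread the input in each iteration, as the abstract describes.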

Authors: Chen Fengke (陈峰科), Sun Zhongyi (孙众毅), Chi Mingmin (池明旻)

Affiliation: School of Computer Science, Fudan University (复旦大学计算机科学技术学院)

Source: Microcomputer Applications (《微型电脑应用》), 2015, No. 8, pp. 65-67, 6 (3 pages in total)

Funding: National Natural Science Foundation of China (71331005)

Keywords: big data computing; remote sensing data; Hadoop; Spark; MapReduce

Classification: V249 [Aerospace Science and Technology: Flight Vehicle Design]
