
Research on vector spatial data distributed computing using Hadoop projects

Cited by: 15
Abstract To achieve high-performance processing of large-scale vector data, this paper designs and develops a MapReduce-based distributed computing system for vector data on top of the open-source Hadoop project. Based on the characteristics of vector spatial data, and through an analysis of the Key/Value data model and the GeoJSON geographic data encoding format, it constructs a Key/Value text file format for vector data that can be stored in Hadoop HDFS. It then discusses the MapReduce computing process for vector data, describing in detail the key steps of Map data partitioning, parallel processing, and merging of Reduce results. Building on these techniques, a prototype system for distributed vector data computing is developed, its components are described, and it is applied to 1:100,000 land use vector data for the Guanzhong area in China, with good results. The evaluation indicates that Hadoop MapReduce can significantly improve the performance of vector spatial data analysis, especially when more computing nodes are used.
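Source Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2013, Issue 16, pp. 25-29 (5 pages)
Funding National Natural Science Foundation of China (No. 41101364); "One-Three-Five" Strategic Science and Technology Plan Project of the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences (No. 2012ZD010)
Keywords vector spatial data; Key/Value; GeoJSON; Apache Hadoop; MapReduce; distributed computing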
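
The abstract describes storing vector features as Key/Value text records with GeoJSON-encoded values, then processing them with a Map step over data splits and a Reduce step that merges results. The record does not give the paper's actual file layout or job code, so the following is only a minimal sketch under stated assumptions: one feature per line, a tab between key and value, a hypothetical `landuse` property, and plain Python functions standing in for Hadoop's mapper and reducer. All function names here are illustrative, not taken from the paper.

```python
import json
from collections import defaultdict


def encode_feature(feature_id, feature):
    """Assumed layout: key, a tab, then compact GeoJSON as the value."""
    return f"{feature_id}\t{json.dumps(feature, separators=(',', ':'))}"


def square_feature(x, y, size, landuse):
    """Hypothetical helper: a square GeoJSON polygon with a land use class."""
    ring = [[x, y], [x + size, y], [x + size, y + size], [x, y + size], [x, y]]
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {"landuse": landuse},
    }


def ring_area(ring):
    """Planar shoelace area of a closed ring of [x, y] pairs."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def map_record(line):
    """Map step: parse one Key/Value line and emit (land use class, area)."""
    _key, value = line.split("\t", 1)
    feature = json.loads(value)
    rings = feature["geometry"]["coordinates"]
    # Exterior ring area minus the area of any interior rings (holes).
    area = ring_area(rings[0]) - sum(ring_area(r) for r in rings[1:])
    yield feature["properties"]["landuse"], area


def reduce_by_key(pairs):
    """Reduce step: merge the mapped values that share a key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(lines, n_splits=2):
    """Simulate the job locally: split the input, map each split, then reduce."""
    splits = [lines[i::n_splits] for i in range(n_splits)]
    pairs = []
    for split in splits:  # on a cluster, each split would go to its own mapper
        for line in split:
            pairs.extend(map_record(line))
    return reduce_by_key(pairs)


if __name__ == "__main__":
    records = [
        encode_feature("f1", square_feature(0, 0, 1, "forest")),
        encode_feature("f2", square_feature(10, 10, 2, "farmland")),
        encode_feature("f3", square_feature(5, 5, 1, "forest")),
    ]
    print(run_job(records))
```

Keeping each feature on a single line is what makes this kind of layout convenient for line-oriented input splitting: any split boundary falls between complete features, so each mapper can parse its records independently.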

References (9)

  • 1. Yang H, Dasdan A, Hsiao R L, et al. Map-reduce-merge: simplified relational data processing on large clusters[C]//Zhou L Z, Ling T W. Proceedings of International Conference on Management of Data, Beijing. USA: Association for Computing Machinery, 2007: 1029-1040.
  • 2. Zhang Shubin, Han Jizhong, Liu Zhiyong, et al. SJMR: parallelizing spatial join with MapReduce on clusters[C]//Steding T. Proceedings of International Conference on Cluster Computing and Workshops, New Orleans, LA. USA: IEEE Computer Society, 2009: 1-4.
  • 3. Wang Kai, Han Jizhong, Tu Bibo, et al. Accelerating spatial data processing with MapReduce[C]//Jiang H. Proceedings of the 16th International Conference on Parallel and Distributed Systems, Shanghai. USA: IEEE Computer Society, 2010: 229-236.
  • 4. Fan Jianyong, Long Ming, Xiong Wei. Research on distributed storage of vector spatial data based on HBase[J]. Geography and Geo-Information Science, 2012, 28(5): 39-42. Cited by: 39
  • 5. Stonebraker M. SQL databases v. NoSQL databases[J]. Communications of the ACM, 2010, 53(4): 10-11.
  • 6. Cary A, Sun Zhengguo, Hristidis V, et al. Experiences on processing spatial data with MapReduce[J]. Scientific and Statistical Database, 2009: 302-319.
  • 7. Zhang Shubin, Han Jizhong, Liu Zhiyong, Wang Kai. Research on implementing spatial queries with MapReduce[J]. 高技术通讯 (Chinese High Technology Letters), 2010, 20(7): 719-726. Cited by: 15
  • 8. Feng Min, Yin Fang, Zhu Yunqiang, Song Jia. Research on distributed terrain data computing based on MapReduce[J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 2011, 39(S1): 24-27. Cited by: 9
  • 9. Gao Ang. Research on distributed computing services for spatial data[D]. Beijing: Graduate University of Chinese Academy of Sciences (Institute of Geographic Sciences and Natural Resources Research), 2011.


