Constructing Data Nodes of the China-VO with the MapReduce (用MapReduce框架构建虚拟天文台数据节点)

Abstract: MapReduce is a large-scale distributed parallel processing framework. It was first used to process massive data sets in Internet services and has since spread to many other fields. The Virtual Observatory now faces a growing volume of astronomical data from ground-based and space telescopes. To improve the capacity of China Virtual Observatory (China-VO) data nodes to process such data, this paper proposes, for the first time, building China-VO data nodes on the MapReduce framework. The integration takes three steps: an astronomical query is translated into a standard SQL query, the SQL query is turned into a MapReduce job, and the results are output in standard astronomical data formats. Because cross-identification between object catalogs takes place only once, most of the time in the MapReduce job is spent on indexing and computing data. Batch catalog cross-identification is implemented on the MapReduce framework as a worked example, and the performance evaluation shows that it outperforms the traditional DBMS-based approach. The results also show that building data nodes on MapReduce brings gains in performance, scalability, and cost.
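The cross-identification step described in the abstract can be illustrated with a minimal, single-process sketch of the map, shuffle, and reduce phases. It is not the authors' implementation. The abstract does not say how the sky is partitioned, so the declination-zone bucketing, the zone height `ZONE_HEIGHT_DEG`, the `(id, ra, dec)` row layout, and every function name below are assumptions made only for illustration. A real deployment would run the two phases as a job on a MapReduce cluster rather than in one Python process.

```python
import math
from collections import defaultdict

# Assumed zone height in degrees; the match radius must not exceed it.
ZONE_HEIGHT_DEG = 0.5


def zone_of(dec_deg):
    """Index of the declination zone that contains dec_deg."""
    return int(math.floor((dec_deg + 90.0) / ZONE_HEIGHT_DEG))


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def map_phase(catalog_name, rows):
    """Emit (zone, record) pairs.

    Catalog "B" is copied into the neighbouring zones so that a pair
    straddling a zone border still meets in one reducer.
    """
    for obj_id, ra, dec in rows:
        z = zone_of(dec)
        zones = (z,) if catalog_name == "A" else (z - 1, z, z + 1)
        for key in zones:
            yield key, (catalog_name, obj_id, ra, dec)


def shuffle(pairs):
    """Group mapper output by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(records, radius_deg):
    """Compare every A object with every B object in one zone."""
    a_rows = [r for r in records if r[0] == "A"]
    b_rows = [r for r in records if r[0] == "B"]
    for _, a_id, a_ra, a_dec in a_rows:
        for _, b_id, b_ra, b_dec in b_rows:
            if angular_sep_deg(a_ra, a_dec, b_ra, b_dec) <= radius_deg:
                yield a_id, b_id


def cross_identify(cat_a, cat_b, radius_deg):
    """Run the three phases in-process and return sorted (a_id, b_id) matches."""
    if radius_deg > ZONE_HEIGHT_DEG:
        raise ValueError("match radius must not exceed the zone height")
    pairs = list(map_phase("A", cat_a)) + list(map_phase("B", cat_b))
    matches = set()
    for records in shuffle(pairs).values():
        matches.update(reduce_phase(records, radius_deg))
    return sorted(matches)
```

Calling `cross_identify(cat_a, cat_b, 2.0 / 3600.0)` on two lists of `(id, ra, dec)` tuples returns the pairs that lie within 2 arcseconds of each other. Emitting catalog A once and replicating only catalog B keeps each candidate pair in exactly one reducer, at the cost of tripling the B records that are shuffled.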

Authors: Song Xuan (宋烜), Zhou Wei (周薇), Han Jizhong (韩冀中), Cui Chenzhou (崔辰州)

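Affiliations: Beijing Planetarium; Institute of Computing Technology, Chinese Academy of Sciences; National Astronomical Observatories, Chinese Academy of Sciences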

Source: Astronomical Research & Technology (《天文研究与技术》), CSCD, 2012, No. 2, pp. 150-156 (7 pages)

Funding: Supported by the National Natural Science Foundation of China (10820002, 60920010, 90912005)

Keywords: MapReduce; China-VO (China Virtual Observatory); Cross-identification

Classification number: P112 [Astronomy and Earth Sciences: Astronomy]

