Abstract
Meteorological data volumes are very large: the system must process about 800 GB of incremental data every day, and the historical data files total more than 1 PB. To store these data promptly and to meet the demands of meteorological data services, a well-designed, efficient data service system is urgently needed. This paper designs a data management system based on Elasticsearch and a metadata management approach. According to business type and the characteristics of data names, all data are divided into 13 major categories and more than 260 metadata types, which share a common metadata template to facilitate unified management. A separate index is built for each of the 13 categories, and search terms specific to the meteorological domain are defined, so that data files can be located and accessed quickly. With this design, files of a given major category can be found in a repository of 500 million files within 1 to 2 seconds, and data of a specific metadata type can be found more precisely within 2 to 3 seconds. The design basically meets current data service needs.
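The design summarized in the abstract (one shared metadata template, one index per major category, filtering by metadata type) can be illustrated with a minimal sketch. It is not the authors' implementation: the field names, the `met-` index prefix, and the example category and type codes are all hypothetical, since the abstract does not list them. The sketch only builds plain dictionaries in Elasticsearch mapping and Query DSL form, so it runs without a cluster.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical shared metadata template, written as an Elasticsearch mapping.
# "keyword", "date" and "long" are standard Elasticsearch field types;
# the field names themselves are assumptions made for this illustration.
METADATA_MAPPING = {
    "properties": {
        "category": {"type": "keyword"},
        "metadata_type": {"type": "keyword"},
        "file_name": {"type": "keyword"},
        "path": {"type": "keyword"},
        "observed_at": {"type": "date"},
        "size_bytes": {"type": "long"},
    }
}


@dataclass
class FileMetadata:
    """One record per data file, following the shared template above."""
    category: str        # one of the 13 major categories (codes are hypothetical)
    metadata_type: str   # one of the 260+ metadata types (codes are hypothetical)
    file_name: str
    path: str
    observed_at: str     # ISO 8601 timestamp
    size_bytes: int


def index_name(category: str) -> str:
    """Map a major category to its own index (naming scheme is an assumption)."""
    return f"met-{category.lower()}"


def to_document(meta: FileMetadata) -> dict:
    """Convert a metadata record into the JSON body that would be indexed."""
    return asdict(meta)


def build_query(metadata_type: Optional[str] = None,
                start: Optional[str] = None,
                end: Optional[str] = None,
                size: int = 100) -> dict:
    """Build a Query DSL body that narrows a category index by type and time."""
    filters = []
    if metadata_type is not None:
        # Exact match; relies on metadata_type being mapped as a keyword field.
        filters.append({"term": {"metadata_type": metadata_type}})
    if start is not None or end is not None:
        bounds = {}
        if start is not None:
            bounds["gte"] = start
        if end is not None:
            bounds["lte"] = end
        filters.append({"range": {"observed_at": bounds}})
    if not filters:
        # No constraints: return every file in the category index.
        return {"query": {"match_all": {}}, "size": size}
    return {"query": {"bool": {"filter": filters}}, "size": size}
```

Under these assumptions, a search for one major category targets a single index with a `match_all` body, while a search for one metadata type adds a `term` filter inside the same index, which mirrors the two search granularities described in the abstract.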
Authors
张恩红
尹海燕
李高洁
ZHANG En-hong; YIN Hai-yan; LI Gao-jie (Guangdong Meteorological Observation Data Center, Guangzhou 510641, China)
Source
《计算机技术与发展》
2019, No. 11, pp. 154-158 (5 pages)
Computer Technology and Development
Funding
National Natural Science Foundation of China (41805096)
Natural Science Foundation of Jiangsu Province (BK20180801)