Abstract
Minute-level surface meteorological observation data are characterized by diverse elements, large volume, and high generation frequency, and traditional relational database systems that store and manage such data suffer from saturated load and poor read/write performance. Based on a study of the storage model of the distributed database HBase, a storage structure model for meteorological minute data was designed in which the row key combines observation time and station number, achieving distributed storage of massive meteorological data and management of meta-information. Because HBase's single index responds too slowly to the complex query cases of meteorological operations, the API offered by the search engine Solr was used to build secondary indexes on the relevant fields, chosen with reference to operational query cases, so as to meet retrieval time requirements. Experimental results show that the system has good storage capacity and retrieval efficiency: the ingest rate reaches up to 34,000 records per second, and results for routine query cases are returned within milliseconds, which satisfies the storage and query performance requirements of large-scale meteorological data in operational applications.
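The two design ideas in the abstract, a row key built from time plus station number and a secondary index for queries that the row key cannot serve, can be illustrated with a small in-memory sketch. It does not call HBase or Solr: a plain dict stands in for the HBase table and another dict stands in for the Solr index. The exact key encoding (`YYYYMMDDHHMM` followed by the station number), the helper names, and the sample records are all assumptions made for illustration; the abstract specifies only the "time plus station number" ordering.

```python
from collections import defaultdict
from datetime import datetime

TIME_FORMAT = "%Y%m%d%H%M"  # assumed minute-resolution encoding


def make_row_key(obs_time, station_id):
    """Compose a row key as observation time followed by station number."""
    return obs_time.strftime(TIME_FORMAT) + station_id


# Illustrative records only; station numbers and values are not from the paper.
records = [
    (datetime(2014, 1, 1, 0, 0), "54511", {"temp_c": -2.1}),
    (datetime(2014, 1, 1, 0, 1), "54511", {"temp_c": -2.2}),
    (datetime(2014, 1, 1, 0, 0), "58362", {"temp_c": 6.4}),
]

# Primary store: a dict keyed by row key stands in for an HBase table.
table = {}
for obs_time, station_id, values in records:
    row = dict(values, station=station_id)
    table[make_row_key(obs_time, station_id)] = row

# Secondary index: station number -> row keys. This stands in for the
# Solr index, which maps field values back to row keys.
station_index = defaultdict(list)
for key, row in table.items():
    station_index[row["station"]].append(key)


def scan_time_range(start, stop):
    """Return row keys with start <= time < stop via lexicographic order."""
    lo = start.strftime(TIME_FORMAT)
    hi = stop.strftime(TIME_FORMAT)
    return [k for k in sorted(table) if lo <= k < hi]


def keys_for_station(station_id):
    """Look up row keys for one station through the secondary index."""
    return sorted(station_index[station_id])
```

Because time leads the key, a time-range query is a contiguous range over sorted keys, whereas a per-station query would otherwise have to examine every key; the secondary index answers it with a direct lookup. One trade-off of a time-leading key, not discussed in the abstract, is that writes for the current minute land next to each other in key order.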
Source
《计算机应用》
CSCD
Peking University Core Journals
2014, Issue 9, pp. 2617-2621 (5 pages)
Journal of Computer Applications
Funding
Youth Science and Technology Fund of the National Meteorological Information Center (NMICQJ201310)