Real-time processing system for automatic weather station data on Spark Streaming architecture

Cited by: 16
Abstract: Existing operational platforms for Automatic Weather Station (AWS) data suffer from delayed data processing, slow interactive response, and poor statistical timeliness. To address these problems, a method based on Spark Streaming and HBase is proposed that combines a real-time computing framework with a distributed database system to process large-scale streaming data. Flume collects the AWS data, and Spark Streaming processes the stream and stores the results in HBase. Two algorithms are designed within the Spark framework: one for writing streaming AWS data into HBase, and one for real-time statistics of the extreme values of observed meteorological elements. A fast and reliable application system for real-time acquisition, processing, and statistics of AWS data is implemented on the Cloudera platform. Comparative analysis and performance monitoring confirm that the system offers low latency and high throughput, runs stably, and is well load-balanced. Experimental results show that, when Spark Streaming is used for real-time operational processing of AWS data, parallel writes to HBase, HBase-based queries, and statistics on the various elements all achieve millisecond-level response. The system fully meets the application requirements for AWS data and effectively supports weather forecasting operations.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2018, No. 1, pp. 38-43, 55 (7 pages)
Funding: Special Scientific Research Fund for Public Welfare Industries of the China Meteorological Administration (201206031)
Keywords: Automatic Weather Station (AWS); Spark Streaming; stream computing; meteorological data processing; Flume
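
The abstract mentions a real-time algorithm for element extreme values but does not give its details. As a rough illustration of the general idea only, the sketch below keeps a running minimum and maximum per station and element and updates them one micro-batch at a time, in plain Python rather than Spark. It is not the authors' algorithm: the `Observation` and `Extremes` types, the `update_extremes` function, the station IDs, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Observation:
    station_id: str  # hypothetical station identifier
    element: str     # observed element, e.g. "temperature"
    obs_time: str    # observation time as an ISO-8601 string
    value: float


@dataclass(frozen=True)
class Extremes:
    min_value: float
    min_time: str
    max_value: float
    max_time: str


Key = Tuple[str, str]  # (station_id, element)


def update_extremes(
    state: Dict[Key, Extremes], batch: Iterable[Observation]
) -> Dict[Key, Extremes]:
    """Fold one micro-batch into the running extremes and return the new state."""
    new_state = dict(state)
    for obs in batch:
        key = (obs.station_id, obs.element)
        cur = new_state.get(key)
        if cur is None:
            # First observation for this key is both the minimum and the maximum.
            new_state[key] = Extremes(obs.value, obs.obs_time, obs.value, obs.obs_time)
            continue
        min_value, min_time = cur.min_value, cur.min_time
        max_value, max_time = cur.max_value, cur.max_time
        if obs.value < min_value:
            min_value, min_time = obs.value, obs.obs_time
        if obs.value > max_value:
            max_value, max_time = obs.value, obs.obs_time
        new_state[key] = Extremes(min_value, min_time, max_value, max_time)
    return new_state


def run_example() -> Dict[Key, Extremes]:
    # Two made-up micro-batches standing in for a stream of station reports.
    batches = [
        [
            Observation("S001", "temperature", "2018-01-01T00:00", 3.5),
            Observation("S001", "temperature", "2018-01-01T00:05", 2.0),
        ],
        [
            Observation("S001", "temperature", "2018-01-01T00:10", 5.1),
            Observation("S002", "temperature", "2018-01-01T00:10", -1.0),
        ],
    ]
    state: Dict[Key, Extremes] = {}
    for batch in batches:
        state = update_extremes(state, batch)
    return state
```

In a streaming framework the same per-key fold would typically be expressed as a stateful operation over each micro-batch, with the resulting extremes written to a store for querying. How the paper maps this onto Spark Streaming and HBase is not stated in the abstract.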