Abstract
Storm, a widely used stream computing framework, supports high-performance real-time computation over streaming data. When developing Storm applications, developers must write custom computation modules for each set of stream data requirements, which causes a great deal of repetitive work and makes it hard to adapt when those requirements change. How to rapidly develop Storm applications and configure the corresponding environment from data requirements such as stream data format and computation method is therefore an important problem for improving the development efficiency of most stream computing applications. This paper proposes a method for describing stream data requirements, and designs and implements a Storm-based, data-requirement-driven framework that assists the development of real-time stream data processing applications: from the domain data requirements described by business personnel, it automatically generates a Storm real-time data processing application that meets them. Experiments show that the framework helps people without Storm development skills, and even non-software developers, quickly configure common Storm-based stream computing applications, and that it adapts to a certain extent to common real-time stream data processing requirements.
Authors
周雯
史雪菲
吴毅坚
赵文耘
ZHOU Wen; SHI Xue-fei; WU Yi-jian; ZHAO Wen-yun (Software School, Fudan University, Shanghai 201203, China; Shanghai Key Laboratory of Data Science, Fudan University, Shanghai 201203, China)
Source
《计算机科学》
CSCD
PKU Core Journal
2018, No. 9, pp. 81-88 (8 pages)
Computer Science
Funding
Supported by the Shanghai Science and Technology Development Fund (16JC1400801)
Keywords
Stream calculation
Development framework
Data requirements
Storm