Automatic Flow Control Algorithm Based on Flink Real-time Computation (基于Flink实时计算的自动化流控制算法)

Cited by: 5
Abstract: As business systems grow more complex and diverse, data analysis must deliver results with ever lower latency. Offline analysis no longer meets many production needs, and real-time analysis of big data is increasingly important. Using the popular Flink stream processing framework as the parsing platform, the authors build a distributed architecture for real-time collection and parsing of streaming data. To parse a different data stream, only the configuration needs to be updated, which greatly reduces the amount of code to be developed. To support such configuration updates, the paper analyzes the problems that arise when a distributed parsing architecture updates its configuration file in real time, and proposes changing Flink's parsing logic through flow control. Flow control can flexibly change the parsing logic in real time, reduces the number of program restarts needed for updates, and improves application efficiency. The same log parsing and storage task was run with and without the flow control algorithm. The results show that the architecture with automatic flow control needs less time to develop the parsing logic and deploy the program, and greatly reduces the volume of logs that are stored late, thereby preserving the real-time nature of the stream as far as possible.
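The idea in the abstract, changing parsing logic through a control stream instead of restarting the job, can be illustrated with a small sketch. This is not the paper's algorithm and does not use the Flink API: it is a plain-Python simulation in which control events and data events share one stream. The event shapes, the `stream|body` line format, and the names `ConfigurableParser` and `run` are all assumptions made for illustration. In Flink itself, a comparable effect is commonly achieved by broadcasting a control stream to the parsing operator.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical event shape: ("control", json_config) or ("data", raw_log_line).
Event = Tuple[str, str]


@dataclass
class ConfigurableParser:
    """Parsing operator whose rules are swapped by control events at runtime."""
    patterns: Dict[str, re.Pattern] = field(default_factory=dict)
    version: int = 0

    def apply_control(self, payload: str) -> None:
        # Assumed config format: {stream name: regex with named groups}.
        rules = json.loads(payload)
        self.patterns = {name: re.compile(rx) for name, rx in rules.items()}
        self.version += 1

    def parse(self, line: str) -> Optional[dict]:
        # Assumed line format: "<stream>|<body>".
        stream, _, body = line.partition("|")
        pattern = self.patterns.get(stream)
        match = pattern.match(body) if pattern else None
        if match is None:
            return None  # no rule for this stream, or the body did not match
        return {"stream": stream, "config_version": self.version,
                **match.groupdict()}


def run(events: Iterable[Event], parser: ConfigurableParser) -> Iterator[dict]:
    """Process events in order; control events update the rules in place."""
    for kind, payload in events:
        if kind == "control":
            parser.apply_control(payload)
        else:
            record = parser.parse(payload)
            if record is not None:
                yield record


events = [
    ("control", json.dumps({"web": r"(?P<ip>\S+) (?P<status>\d{3})"})),
    ("data", "web|10.0.0.1 200"),
    # A new field is added to the log format; only the config changes.
    ("control", json.dumps({"web": r"(?P<ip>\S+) (?P<status>\d{3}) (?P<ms>\d+)"})),
    ("data", "web|10.0.0.2 404 37"),
]
records = list(run(events, ConfigurableParser()))
for record in records:
    print(record)
```

The second data line is parsed with the updated rule without the loop being stopped, which is the property the abstract attributes to flow control: fewer restarts and fewer records delayed while a new configuration is rolled out.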
Authors: FAN Chun-mei (樊春美); ZHU Jian-sheng (朱建生); SHAN Xing-hua (单杏花); YANG Li-peng (杨立鹏); LI Wen (李雯) (China Academy of Railway Sciences, Beijing 100081, China)
Source: Computer Technology and Development (《计算机技术与发展》), 2020, No. 8, pp. 66-72 (7 pages)
Funding: China State Railway Group Co., Ltd. 2018 Systematic Major Project (P2018X002); China State Railway Group Co., Ltd. 2019 Major Project (K2019X008).
Keywords: Flink; stream processing; Spark; big data; distributed