Abstract
To address the large scale of stream data and the overlap among basic data processing operations, we propose a location-aware operation sharing optimization algorithm. Assuming that operations have been assigned compute node resources in advance, and while preserving the real-time and dynamic nature of stream processing, the algorithm exploits similarities in topology and operation function across multiple stream data jobs to build a candidate set for operation sharing using a DAG matching method. With output bandwidth as the constraint, it formulates an optimization model of the operation sharing benefit, which addresses the insufficient consideration that traditional operation sharing methods give to the cluster's output bandwidth during stream processing. Simulation results show that, compared with the original algorithm, the method accounts more fully for compute node constraints and increases the sharing benefit, thereby effectively saving system computing resources.
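The two stages named in the abstract (building a sharing candidate set, then choosing what to share under an output bandwidth constraint) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `Op` record, the signature-based grouping used in place of full DAG matching, the greedy selection, and the cost and bandwidth figures are all assumptions made for illustration. Each candidate group is evaluated independently, so interactions between nested shared operators are ignored.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical operator record; all field names are illustrative."""
    job: str         # job the operator belongs to
    name: str        # operator id within the job
    func: str        # operator function, e.g. "filter"
    inputs: tuple    # upstream operator names, or raw stream names
    node: str        # compute node the operator is pre-assigned to
    cost: float      # compute cost of running the operator
    out_rate: float  # output bandwidth the operator consumes


def signature(op, by_key, memo):
    """Function plus upstream signatures: a simplified stand-in for DAG matching."""
    key = (op.job, op.name)
    if key not in memo:
        ups = []
        for i in op.inputs:
            up = by_key.get((op.job, i))
            ups.append(signature(up, by_key, memo) if up else ("src", i))
        memo[key] = (op.func, tuple(sorted(ups, key=repr)))
    return memo[key]


def candidate_sets(ops):
    """Group operators whose function and upstream structure coincide."""
    by_key = {(o.job, o.name): o for o in ops}
    memo, groups = {}, defaultdict(list)
    for o in ops:
        groups[signature(o, by_key, memo)].append(o)
    return [g for g in groups.values() if len(g) > 1]


def select_sharing(ops, capacity):
    """Greedily accept groups whose host node stays within its bandwidth capacity."""
    used = defaultdict(float)
    for o in ops:
        used[o.node] += o.out_rate
    plan, gain = [], 0.0
    groups = sorted(candidate_sets(ops), key=lambda g: -sum(o.cost for o in g))
    for g in groups:
        # Location awareness: host the shared operator on the node with most spare bandwidth.
        host = max(g, key=lambda o: capacity[o.node] - used[o.node])
        others = [o for o in g if o is not host]
        extra = sum(o.out_rate for o in others)  # host now also feeds the other jobs
        freed = sum(o.out_rate for o in others if o.node == host.node)
        if used[host.node] + extra - freed <= capacity[host.node]:
            for o in others:
                used[o.node] -= o.out_rate
            used[host.node] += extra
            gain += sum(o.cost for o in others)  # compute cost saved by sharing
            plan.append((f"{host.job}.{host.name}", [f"{o.job}.{o.name}" for o in others]))
    return plan, gain


# Two hypothetical jobs that share a filter but differ in their aggregation.
ops = [
    Op("A", "f", "filter", ("s",), "n1", 2.0, 3.0),
    Op("A", "g", "sum", ("f",), "n1", 4.0, 1.0),
    Op("B", "f", "filter", ("s",), "n2", 2.0, 3.0),
    Op("B", "g", "max", ("f",), "n2", 4.0, 1.0),
]
plan, gain = select_sharing(ops, {"n1": 10.0, "n2": 5.0})
```

The greedy rule is chosen only for brevity; the optimization model described in the abstract may be solved quite differently.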
Source
《吉林大学学报(理学版)》
CAS
CSCD
Peking University Core Journals (北大核心)
2016, No. 5, pp. 1047-1054 (8 pages)
Journal of Jilin University: Science Edition
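Funding
National Natural Science Foundation of China (Grant No. 61170004)
National Key Research and Development Program of China, High Performance Computing Special Fund (Grant No. 2016YFB0201503)
National Special Fund for Deep Exploration Technology and Experimentation Research (Grant No. SinoProbe-09-01)
Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (Grant No. 20130061110052)
Key Science and Technology Project of the Jilin Province Science and Technology Development Plan (Grant No. 20140204013GX)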
Keywords
stream data
distributed stream processing
operation sharing
sharing optimization