Abstract
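A data flow programming language is a domain-specific language that separates computation from communication and exposes the parallelism of applications. The complexity of the underlying computation, storage, and communication resources in multi-core clusters poses new challenges to the performance of data flow programs. To address the low resource utilization and poor scalability of data flow programs executing on multi-core clusters, this paper proposes and implements a hierarchical pipeline parallelism optimization method for multi-core clusters, using the synchronous data flow graph as the intermediate representation. The method comprises task partitioning and scheduling, hierarchical pipeline scheduling, and data locality optimization; after compiler optimization it generates MPI-based target code that can execute in parallel. Task partitioning and scheduling exploits the data and task parallelism in the program to map tasks onto compute cores, achieving load balance and low communication and synchronization overhead; hierarchical pipeline scheduling exploits the parallelism in the program to construct a low-latency pipeline schedule; data locality optimization is a storage-oriented optimization that targets cache false sharing in data accesses. The experiments use a cluster of x86 multi-core processors as the platform and typical algorithms from the media processing domain as benchmarks to analyze the hierarchical pipeline optimization. The results demonstrate the effectiveness of the optimization method.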
As a domain-specific programming model, data flow programming combines the features of media applications and programming languages and offers an attractive way to express parallelism. However, the complexity of the underlying computation, storage, and communication in cluster systems poses a new challenge to the performance of data flow applications. To address the problems of current data flow programming, the compiler translates the code into a data flow graph as an intermediate representation. This paper proposes an efficient data flow compilation framework for cluster architectures, namely a multi-level pipelining parallelism optimization framework, to optimize the execution of data flow applications. The framework is composed of three optimization phases: (1) task partitioning and scheduling, which maps a data flow graph onto a given cluster for load balance and low communication cost; (2) multi-level pipelining scheduling, which constructs a pipeline schedule with low communication and synchronization cost for data flow programs; and (3) data locality aware optimization, which judiciously repeats actor executions to eliminate false sharing and improve locality. We choose a multi-core cluster as the experimental platform and common algorithms in media processing applications as benchmarks to evaluate the performance of multi-level pipelining parallelism. Our experiments show that the approach scales well and performs well.
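The abstract names the synchronous data flow (SDF) graph as the intermediate representation on which partitioning and scheduling operate. As background, any SDF scheduler first needs the repetition vector: the smallest positive number of firings per actor that leaves every channel's token count unchanged. The following is a minimal sketch of that standard balance-equation computation, not the paper's compiler; the function name `repetition_vector`, the `(src, dst, produce, consume)` edge tuples, and the actor names are assumptions made for illustration.

```python
from collections import defaultdict, deque
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve the SDF balance equations q[src] * produce == q[dst] * consume.

    actors: list of actor names (hypothetical); the graph must be connected.
    edges:  list of (src, dst, produce, consume) tuples with positive rates.
    Returns the smallest positive integer firing count for each actor.
    """
    # Build an undirected view: crossing an edge multiplies the firing rate.
    adj = defaultdict(list)
    for src, dst, produce, consume in edges:
        adj[src].append((dst, Fraction(produce, consume)))
        adj[dst].append((src, Fraction(consume, produce)))

    # Propagate rational firing rates from an arbitrary start actor.
    rate = {actors[0]: Fraction(1)}
    queue = deque([actors[0]])
    while queue:
        a = queue.popleft()
        for b, ratio in adj[a]:
            r = rate[a] * ratio
            if b not in rate:
                rate[b] = r
                queue.append(b)
            elif rate[b] != r:
                raise ValueError("inconsistent rates: no periodic schedule exists")

    if len(rate) != len(actors):
        raise ValueError("graph is not connected")

    # Scale by the least common multiple of denominators to get integers.
    scale = reduce(lambda x, y: x * y // gcd(x, y),
                   (r.denominator for r in rate.values()), 1)
    counts = {a: int(r * scale) for a, r in rate.items()}

    # Divide by the common factor so the result is the minimal solution.
    common = reduce(gcd, counts.values())
    return {a: n // common for a, n in counts.items()}


if __name__ == "__main__":
    # A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
    q = repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)])
    print(q)
```

The firing counts give the per-actor workload within one steady-state iteration, which is the kind of information a partitioner can weigh when balancing load across cores. How the paper's framework actually uses this information is not stated in the abstract.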
Source
《计算机学报》
EI
CSCD
PKU Core Journal
2014, Issue 10, pp. 2071-2083 (13 pages)
Chinese Journal of Computers
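Funding
Supported by the National High Technology Research and Development Program of China ("863" Program), Key Project (2012AA010902)
Supported by the Specialized Research Fund for the Doctoral Program of Higher Education (20120142110089)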